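There are all kinds of guides online for installing Theano on 64-bit Windows 7. I tried many of them and ran into all kinds of problems. Because I had no prior experience with MinGW and similar tools, installation was a struggle; after repeated attempts, I finally succeeded with the procedure below.
The process is actually simple. First, the prerequisites:
- Windows 10 (32-bit and 64-bit both work; be sure to download the installers that match your system)
- VS2010 (it does not have to be VS2010, I just happened to have it; the VS compiler is probably needed when configuring GPU programming)
- Anaconda (go to the official download page; after opening it, wait a moment and the download links appear. I chose it because it bundles Python, the two required libraries NumPy and SciPy, and some other libraries, which saves effort compared with installing them yourself. Any version will do; if you want Python 3.4, download the corresponding Anaconda3. This tutorial uses Anaconda2, i.e. the Python 2.7 build. The installation procedure is the same.)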
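To confirm that the Python you end up with really bundles NumPy and SciPy, a quick check along these lines may help. It is only a sketch: the helper name module_version is made up for illustration, and it simply reports a module's __version__ attribute when the import succeeds.

```python
from __future__ import print_function

import importlib


def module_version(name):
    """Return the module's version string, 'unknown' if it has none,
    or None if the module cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


# Report the two libraries the tutorial relies on.
for required in ("numpy", "scipy"):
    print(required, "->", module_version(required))
```

A result of None for either library suggests that the interpreter being run is not the Anaconda one.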
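Installation steps:
I. Uninstall previous versions.
Uninstall any standalone Python you installed earlier. I had installed Python 2.7 directly when I was learning Python, so I removed it first, because Anaconda ships its own Python.
II. Install Anaconda.
This is very easy. I installed to D:\Anaconda2. Pay special attention here: the installation path must not contain spaces! I learned this the hard way.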
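The no-spaces rule can be checked from inside Python once Anaconda is installed. The sketch below assumes that sys.prefix points at the Anaconda root (D:\Anaconda2 in this tutorial); path_has_space is a hypothetical helper name.

```python
from __future__ import print_function

import sys


def path_has_space(path):
    """Return True if the given path contains a space character."""
    return " " in path


# sys.prefix is the root directory of the running Python installation.
if path_has_space(sys.prefix):
    print("Warning: install path contains a space:", sys.prefix)
else:
    print("Install path looks fine:", sys.prefix)
```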
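III. Install MinGW.
Other tutorials tell you to add D:\Anaconda2\MinGW\bin;D:\Anaconda2\MinGW\x86_64-w64-mingw32\lib; to the path environment variable, but you will find that there is no MinGW directory under D:\Anaconda2\ at all. The best approach is therefore to install it with a command; there is no need to download mingw-setup.exe or the like yourself.
How to install:
- Open CMD (the Windows Command Prompt, not the Python interpreter; conda is a Windows command, so typing it inside Python gives a syntax error);
- Type conda install mingw libpython and press Enter. The installation progress is shown and it finishes shortly. The D:\Anaconda2\MinGW directory now exists.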
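After the conda command finishes, the result can be verified with a few lines of Python. This is a rough sketch under two assumptions: that the MinGW directory sits directly under sys.prefix, as described above, and that the compiler executable is named g++.exe. The helper find_on_path is hypothetical and just walks the PATH entries by hand.

```python
from __future__ import print_function

import os
import sys


def find_on_path(exe_name, path_value=None):
    """Return the full path of the first PATH entry containing exe_name,
    or None if no entry contains it."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for directory in path_value.split(os.pathsep):
        candidate = os.path.join(directory, exe_name)
        if directory and os.path.isfile(candidate):
            return candidate
    return None


mingw_dir = os.path.join(sys.prefix, "MinGW")
print("MinGW directory exists:", os.path.isdir(mingw_dir))
print("g++ found at:", find_on_path("g++.exe"))
```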
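IV. Configure environment variables.
- Edit the path variable under user variables (create it if it does not exist; it usually does) and append D:\Anaconda2;D:\Anaconda2\Scripts; to it. Do not drop the semicolons. My Anaconda installation directory is D:\Anaconda2; substitute your own.
- Create a new user variable named pythonpath with the value D:\Anaconda2\Lib\site-packages\theano; This tells Python where Theano is installed. We have not installed it yet, but there is no hurry; set the variable now anyway.
- Open cmd. The window shows a path; mine is C:\Users\Administrator>. Go to the corresponding directory for your own path and create a text file named .theanorc.txt there (note the two dots). Edit it so that it contains:
[global]
openmp=False
[blas]
ldflags=
[gcc]
cxxflags=-ID:\Anaconda2\MinGW
The D:\Anaconda2 part of the last line (shown in red in the original post) is your Anaconda installation path. Make sure it is correct, otherwise MinGW will not be found.
- It is best to restart the computer afterwards.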
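To avoid typos in .theanorc.txt, its contents can be generated and read back with Python's standard config parser. The sketch below does not write any file; it only builds the same text as above for a given Anaconda root and parses it again. The helper names build_theanorc and read_cxxflags are made up, and the use of the home directory as the target location is an assumption that matches the C:\Users\Administrator example.

```python
from __future__ import print_function

import io
import os

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2.7


def build_theanorc(anaconda_root):
    """Return the .theanorc.txt text from this tutorial for a given root."""
    lines = [
        "[global]",
        "openmp=False",
        "[blas]",
        "ldflags=",
        "[gcc]",
        "cxxflags=-I" + anaconda_root + "\\MinGW",
    ]
    return "\n".join(lines) + "\n"


def read_cxxflags(text):
    """Parse config text and return the cxxflags value of the [gcc] section."""
    parser = configparser.RawConfigParser()
    # read_file exists on Python 3; Python 2.7 only has readfp.
    reader = getattr(parser, "read_file", None) or parser.readfp
    reader(io.StringIO(u"" + text))
    return parser.get("gcc", "cxxflags")


text = build_theanorc(r"D:\Anaconda2")
print(text)
print("Parsed cxxflags:", read_cxxflags(text))
print("Expected location:", os.path.join(os.path.expanduser("~"), ".theanorc.txt"))
```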
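V. Install Theano.
There is no need to download a zip archive manually; installing with a command is the simplest way.
- Open CMD, the same way as when installing MinGW; do not enter Python.
- Type pip install theano and press Enter. You get a pleasant download progress bar; the package is small, so it installs quickly.
- On my machine the pip command would not run and failed with this error:
Unable to create process using '""
For now I used python -m pip install theano instead.
- In cmd, type python to enter the Python interpreter, then type import theano and press Enter. This takes a while.
- Next, type theano.test(). It prints a long stream of output; as long as there are no errors, the installation succeeded. It was still printing after I had waited for some time, so I quit with Ctrl+C. (In fact, I found that some error messages do not matter: Theano still works normally, including theano.function(). So if you still get errors no matter how you configure things, you can ignore them for now and simply run a program to try it out, for example some convolution code.)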
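A much faster check than theano.test() is to compile and call one tiny function. The following is a minimal sketch, not a replacement for the full test suite: theano_smoke_test is a hypothetical helper that returns None when Theano cannot be imported, so the snippet can be run safely on a machine where the installation is still broken.

```python
from __future__ import print_function


def theano_smoke_test():
    """Compile x + y with Theano and evaluate it at (2, 3).

    Returns the result as a float, or None if Theano cannot be imported.
    """
    try:
        import theano
        import theano.tensor as T
    except Exception as exc:  # a broken install can fail in many ways
        print("Theano is not usable:", exc)
        return None
    x = T.dscalar("x")
    y = T.dscalar("y")
    add = theano.function([x, y], x + y)
    return float(add(2.0, 3.0))


print("2 + 3 =", theano_smoke_test())
```

If this prints 5.0, the compiler setup from the previous steps is at least good enough for simple graphs.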
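VI. Using the GPU
My computer has an AMD graphics card, which CUDA does not support, so using the GPU is out of the question.
VII. The deep learning framework Keras
- Open CMD, the same way as when installing MinGW; do not enter Python.
- Type pip install keras and press Enter; again you get a download progress bar.
Here too the pip command would not run, so I used python -m pip install keras instead.
Note: pip is recognized in the Anaconda Prompt, so the two pip commands above can also be run there directly, with the same result.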
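The python -m pip workaround can also be driven from a script, which guarantees that pip runs under the same interpreter that will later import the package. This is a sketch only; pip_install_command and pip_install are hypothetical helper names, and nothing is installed unless pip_install is actually called.

```python
from __future__ import print_function

import subprocess
import sys


def pip_install_command(package):
    """Build the 'python -m pip install <package>' command as a list."""
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package):
    """Run the install command and return pip's exit code (0 on success)."""
    return subprocess.call(pip_install_command(package))


# Show the commands without running them.
for name in ("theano", "keras"):
    print(" ".join(pip_install_command(name)))
```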
VIII. Small examples
1. Theano test
from __future__ import print_function
"""
Created on Tue Aug 16 14:05:45 2016

@author: Administrator
"""

"""
This tutorial introduces logistic regression using Theano and stochastic
gradient descent.

Logistic regression is a probabilistic, linear classifier. It is parametrized
by a weight matrix :math:`W` and a bias vector :math:`b`. Classification is
done by projecting data points onto a set of hyperplanes, the distance to
which is used to determine a class membership probability.

Mathematically, this can be written as:

.. math::
  P(Y=i|x, W,b) &= softmax_i(W x + b) \\
                &= \frac {e^{W_i x + b_i}} {\sum_j e^{W_j x + b_j}}


The output of the model or prediction is then done by taking the argmax of
the vector whose i'th element is P(Y=i|x).

.. math::

  y_{pred} = argmax_i P(Y=i|x,W,b)


This tutorial presents a stochastic gradient descent optimization method
suitable for large datasets.


References:

    - textbooks: "Pattern Recognition and Machine Learning" -
                 Christopher M. Bishop, section 4.3.2

"""



__docformat__ = 'restructedtext en'

import six.moves.cPickle as pickle
import gzip
import os
import sys
import timeit

import numpy

import theano
import theano.tensor as T


class LogisticRegression(object):
    """Multi-class Logistic Regression Class

    The logistic regression is fully described by a weight matrix :math:`W`
    and bias vector :math:`b`. Classification is done by projecting data
    points onto a set of hyperplanes, the distance to which is used to
    determine a class membership probability.
    """

    def __init__(self, input, n_in, n_out):
        """ Initialize the parameters of the logistic regression

        :type input: theano.tensor.TensorType
        :param input: symbolic variable that describes the input of the
                      architecture (one minibatch)

        :type n_in: int
        :param n_in: number of input units, the dimension of the space in
                     which the datapoints lie

        :type n_out: int
        :param n_out: number of output units, the dimension of the space in
                      which the labels lie

        """
        # start-snippet-1
        # initialize with 0 the weights W as a matrix of shape (n_in, n_out)
        self.W = theano.shared(
            value=numpy.zeros(
                (n_in, n_out),
                dtype=theano.config.floatX
            ),
            name='W',
            borrow=True
        )
        # initialize the biases b as a vector of n_out 0s
        self.b = theano.shared(
            value=numpy.zeros(
                (n_out,),
                dtype=theano.config.floatX
            ),
            name='b',
            borrow=True
        )

        # symbolic expression for computing the matrix of class-membership
        # probabilities
        # Where:
        # W is a matrix where column-k represent the separation hyperplane for
        # class-k
        # x is a matrix where row-j  represents input training sample-j
        # b is a vector where element-k represent the free parameter of
        # hyperplane-k
        self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)

        # symbolic description of how to compute prediction as class whose
        # probability is maximal
        self.y_pred = T.argmax(self.p_y_given_x, axis=1)
        # end-snippet-1

        # parameters of the model
        self.params = [self.W, self.b]

        # keep track of model input
        self.input = input

    def negative_log_likelihood(self, y):
        """Return the mean of the negative log-likelihood of the prediction
        of this model under a given target distribution.

        .. math::

            \frac{1}{|\mathcal{D}|} \mathcal{L} (\theta=\{W,b\}, \mathcal{D}) =
            \frac{1}{|\mathcal{D}|} \sum_{i=0}^{|\mathcal{D}|}
                \log(P(Y=y^{(i)}|x^{(i)}, W,b)) \\
            \ell (\theta=\{W,b\}, \mathcal{D})

        :type y: theano.tensor.TensorType
        :param y: corresponds to a vector that gives for each example the
                  correct label

        Note: we use the mean instead of the sum so that
              the learning rate is less dependent on the batch size
        """
        # start-snippet-2
        # y.shape[0] is (symbolically) the number of rows in y, i.e.,
        # number of examples (call it n) in the minibatch
        # T.arange(y.shape[0]) is a symbolic vector which will contain
        # [0,1,2,... n-1] T.log(self.p_y_given_x) is a matrix of
        # Log-Probabilities (call it LP) with one row per example and
        # one column per class LP[T.arange(y.shape[0]),y] is a vector
        # v containing [LP[0,y[0]], LP[1,y[1]], LP[2,y[2]], ...,
        # LP[n-1,y[n-1]]] and T.mean(LP[T.arange(y.shape[0]),y]) is
        # the mean (across minibatch examples) of the elements in v,
        # i.e., the mean log-likelihood across the minibatch.
        return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
        # end-snippet-2

    def errors(self, y):
        """Return a float representing the number of errors in the minibatch
        over the total number of examples of the minibatch ; zero one
        loss over the size of the minibatch

        :type y: theano.tensor.TensorType
        :param y: corresponds to a vector that gives for each example the
                  correct label
        """

        # check if y has same dimension of y_pred
        if y.ndim != self.y_pred.ndim:
            raise TypeError(
                'y should have the same shape as self.y_pred',
                ('y', y.type, 'y_pred', self.y_pred.type)
            )
        # check if y is of the correct datatype
        if y.dtype.startswith('int'):
            # the T.neq operator returns a vector of 0s and 1s, where 1
            # represents a mistake in prediction
            return T.mean(T.neq(self.y_pred, y))
        else:
            raise NotImplementedError()


def load_data(dataset):
    ''' Loads the dataset

    :type dataset: string
    :param dataset: the path to the dataset (here MNIST)
    '''

    #############
    # LOAD DATA #
    #############

    # Download the MNIST dataset if it is not present
    data_dir, data_file = os.path.split(dataset)
    if data_dir == "" and not os.path.isfile(dataset):
        # Check if dataset is in the data directory.
        new_path = os.path.join(
            os.path.split(__file__)[0],
            "..",
            "data",
            dataset
        )
        if os.path.isfile(new_path) or data_file == 'mnist.pkl.gz':
            dataset = new_path

    if (not os.path.isfile(dataset)) and data_file == 'mnist.pkl.gz':
        from six.moves import urllib
        origin = (
            'http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz'
        )
        print('Downloading data from %s' % origin)
        urllib.request.urlretrieve(origin, dataset)

    print('... loading data')

    # Load the dataset
    with gzip.open(dataset, 'rb') as f:
        try:
            train_set, valid_set, test_set = pickle.load(f, encoding='latin1')
        except:
            train_set, valid_set, test_set = pickle.load(f)
    # train_set, valid_set, test_set format: tuple(input, target)
    # input is a numpy.ndarray of 2 dimensions (a matrix)
    # where each row corresponds to an example. target is a
    # numpy.ndarray of 1 dimension (vector) that has the same length as
    # the number of rows in the input. It should give the target
    # to the example with the same index in the input.

    def shared_dataset(data_xy, borrow=True):
        """ Function that loads the dataset into shared variables

        The reason we store our dataset in shared variables is to allow
        Theano to copy it into the GPU memory (when code is run on GPU).
        Since copying data into the GPU is slow, copying a minibatch everytime
        is needed (the default behaviour if the data is not in a shared
        variable) would lead to a large decrease in performance.
        """
        data_x, data_y = data_xy
        shared_x = theano.shared(numpy.asarray(data_x,
                                               dtype=theano.config.floatX),
                                 borrow=borrow)
        shared_y = theano.shared(numpy.asarray(data_y,
                                               dtype=theano.config.floatX),
                                 borrow=borrow)
        # When storing data on the GPU it has to be stored as floats
        # therefore we will store the labels as ``floatX`` as well
        # (``shared_y`` does exactly that). But during our computations
        # we need them as ints (we use labels as index, and if they are
        # floats it doesn't make sense) therefore instead of returning
        # ``shared_y`` we will have to cast it to int. This little hack
        # lets us get around this issue
        return shared_x, T.cast(shared_y, 'int32')

    test_set_x, test_set_y = shared_dataset(test_set)
    valid_set_x, valid_set_y = shared_dataset(valid_set)
    train_set_x, train_set_y = shared_dataset(train_set)

    rval = [(train_set_x, train_set_y), (valid_set_x, valid_set_y),
            (test_set_x, test_set_y)]
    return rval


def sgd_optimization_mnist(learning_rate=0.13, n_epochs=1000,
                           dataset='mnist.pkl.gz',
                           batch_size=600):
    """
    Demonstrate stochastic gradient descent optimization of a log-linear
    model

    This is demonstrated on MNIST.

    :type learning_rate: float
    :param learning_rate: learning rate used (factor for the stochastic
                          gradient)

    :type n_epochs: int
    :param n_epochs: maximal number of epochs to run the optimizer

    :type dataset: string
    :param dataset: the path of the MNIST dataset file from
                 http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz

    """
    datasets = load_data(dataset)

    train_set_x, train_set_y = datasets[0]
    valid_set_x, valid_set_y = datasets[1]
    test_set_x, test_set_y = datasets[2]

    # compute number of minibatches for training, validation and testing
    n_train_batches = train_set_x.get_value(borrow=True).shape[0] // batch_size
    n_valid_batches = valid_set_x.get_value(borrow=True).shape[0] // batch_size
    n_test_batches = test_set_x.get_value(borrow=True).shape[0] // batch_size

    ######################
    # BUILD ACTUAL MODEL #
    ######################
    print('... building the model')

    # allocate symbolic variables for the data
    index = T.lscalar()  # index to a [mini]batch

    # generate symbolic variables for input (x and y represent a
    # minibatch)
    x = T.matrix('x')  # data, presented as rasterized images
    y = T.ivector('y')  # labels, presented as 1D vector of [int] labels

    # construct the logistic regression class
    # Each MNIST image has size 28*28
    classifier = LogisticRegression(input=x, n_in=28 * 28, n_out=10)

    # the cost we minimize during training is the negative log likelihood of
    # the model in symbolic format
    cost = classifier.negative_log_likelihood(y)

    # compiling a Theano function that computes the mistakes that are made by
    # the model on a minibatch
    test_model = theano.function(
        inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: test_set_x[index * batch_size: (index + 1) * batch_size],
            y: test_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )

    validate_model = theano.function(
        inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: valid_set_x[index * batch_size: (index + 1) * batch_size],
            y: valid_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )

    # compute the gradient of cost with respect to theta = (W,b)
    g_W = T.grad(cost=cost, wrt=classifier.W)
    g_b = T.grad(cost=cost, wrt=classifier.b)

    # start-snippet-3
    # specify how to update the parameters of the model as a list of
    # (variable, update expression) pairs.
    updates = [(classifier.W, classifier.W - learning_rate * g_W),
               (classifier.b, classifier.b - learning_rate * g_b)]

    # compiling a Theano function `train_model` that returns the cost, but in
    # the same time updates the parameter of the model based on the rules
    # defined in `updates`
    train_model = theano.function(
        inputs=[index],
        outputs=cost,
        updates=updates,
        givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )
    # end-snippet-3

    ###############
    # TRAIN MODEL #
    ###############
    print('... training the model')
    # early-stopping parameters
    patience = 5000  # look at this many examples regardless
    patience_increase = 2  # wait this much longer when a new best is
                                  # found
    improvement_threshold = 0.995  # a relative improvement of this much is
                                  # considered significant
    validation_frequency = min(n_train_batches, patience // 2)
                                  # go through this many
                                  # minibatches before checking the network
                                  # on the validation set; in this case we
                                  # check every epoch

    best_validation_loss = numpy.inf
    test_score = 0.
    start_time = timeit.default_timer()

    done_looping = False
    epoch = 0
    while (epoch < n_epochs) and (not done_looping):
        epoch = epoch + 1
        for minibatch_index in range(n_train_batches):

            minibatch_avg_cost = train_model(minibatch_index)
            # iteration number
            iter = (epoch - 1) * n_train_batches + minibatch_index

            if (iter + 1) % validation_frequency == 0:
                # compute zero-one loss on validation set
                validation_losses = [validate_model(i)
                                     for i in range(n_valid_batches)]
                this_validation_loss = numpy.mean(validation_losses)

                print(
                    'epoch %i, minibatch %i/%i, validation error %f %%' %
                    (
                        epoch,
                        minibatch_index + 1,
                        n_train_batches,
                        this_validation_loss * 100.
                    )
                )

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:
                    #improve patience if loss improvement is good enough
                    if this_validation_loss < best_validation_loss *  \
                       improvement_threshold:
                        patience = max(patience, iter * patience_increase)

                    best_validation_loss = this_validation_loss
                    # test it on the test set

                    test_losses = [test_model(i)
                                   for i in range(n_test_batches)]
                    test_score = numpy.mean(test_losses)

                    print(
                        (
                            '     epoch %i, minibatch %i/%i, test error of'
                            ' best model %f %%'
                        ) %
                        (
                            epoch,
                            minibatch_index + 1,
                            n_train_batches,
                            test_score * 100.
                        )
                    )

                    # save the best model
                    with open('best_model.pkl', 'wb') as f:
                        pickle.dump(classifier, f)

            if patience <= iter:
                done_looping = True
                break

    end_time = timeit.default_timer()
    print(
        (
            'Optimization complete with best validation score of %f %%,'
            'with test performance %f %%'
        )
        % (best_validation_loss * 100., test_score * 100.)
    )
    print('The code run for %d epochs, with %f epochs/sec' % (
        epoch, 1. * epoch / (end_time - start_time)))
    print(('The code for file ' +
           os.path.split(__file__)[1] +
           ' ran for %.1fs' % ((end_time - start_time))), file=sys.stderr)


def predict():
    """
    An example of how to load a trained model and use it
    to predict labels.
    """

    # load the saved model (binary mode, matching the 'wb' used when saving)
    with open('best_model.pkl', 'rb') as f:
        classifier = pickle.load(f)

    # compile a predictor function
    predict_model = theano.function(
        inputs=[classifier.input],
        outputs=classifier.y_pred)

    # We can test it on some examples from the test set
    dataset='mnist.pkl.gz'
    datasets = load_data(dataset)
    test_set_x, test_set_y = datasets[2]
    test_set_x = test_set_x.get_value()

    predicted_values = predict_model(test_set_x[:10])
    print("Predicted values for the first 10 examples in test set:")
    print(predicted_values)


if __name__ == '__main__':
    sgd_optimization_mnist()
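The quantities the script builds symbolically (softmax probabilities, the mean negative log-likelihood, the argmax prediction, and the zero-one error) can be followed on a tiny example in plain Python. This is an illustrative sketch only: it uses lists rather than Theano tensors, and all function names here are made up for the illustration.

```python
from __future__ import print_function

import math


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_negative_log_likelihood(prob_rows, labels):
    """Average of -log(probability of the correct class) over the rows."""
    picked = [math.log(row[label]) for row, label in zip(prob_rows, labels)]
    return -sum(picked) / len(picked)


def predict_labels(prob_rows):
    """Index of the most probable class in each row (the argmax)."""
    return [row.index(max(row)) for row in prob_rows]


def error_rate(predictions, labels):
    """Fraction of predictions that differ from the labels."""
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / float(len(labels))


# Two examples, two classes: rows play the role of W x + b.
scores = [[0.0, 0.0], [0.0, math.log(3.0)]]
labels = [0, 1]
probs = [softmax(row) for row in scores]
print("probabilities:", probs)
print("mean NLL:", mean_negative_log_likelihood(probs, labels))
print("error rate:", error_rate(predict_labels(probs), labels))
```

The list comprehension in mean_negative_log_likelihood corresponds to the LP[T.arange(y.shape[0]), y] indexing in the script: it picks, for each row, the log-probability of that row's correct label.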
2. Keras test
'''Trains a simple convnet on the MNIST dataset.
Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''

from __future__ import print_function
import numpy as np
np.random.seed(1337)  # for reproducibility

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils

batch_size = 128
nb_classes = 10
nb_epoch = 12

# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
# convolution kernel size
nb_conv = 3

# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

model = Sequential()

model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
                        border_mode='valid',
                        input_shape=(1, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])

model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
          verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
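To see how the image size shrinks through the network above, the shape arithmetic can be worked out by hand. The sketch below is plain Python with made-up helper names and does not call Keras. It assumes that both convolutions use 'valid' borders with stride 1 and that the pooling stride equals the pool size.

```python
from __future__ import print_function


def valid_conv_size(size, kernel):
    """Output width of a stride-1 'valid' convolution."""
    return size - kernel + 1


def pooled_size(size, pool):
    """Output width of non-overlapping max pooling."""
    return size // pool


nb_filters, nb_conv, nb_pool = 32, 3, 2

size = 28                              # MNIST images are 28 x 28
size = valid_conv_size(size, nb_conv)  # first convolution  -> 26
size = valid_conv_size(size, nb_conv)  # second convolution -> 24
size = pooled_size(size, nb_pool)      # max pooling        -> 12

flat_units = nb_filters * size * size  # input width of Dense(128)
print("feature map:", size, "x", size)
print("units after Flatten:", flat_units)  # 4608
```

Under these assumptions the Dense(128) layer has 4608 * 128 weights plus 128 biases, which is where most of the model's parameters sit.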
References:
2. Installing Theano on 64-bit Windows 7/10 for beginners, with GPU acceleration (no MinGW etc., detailed steps)
3. https://bitbucket.org/pypa/distlib/issues/47/exe-launcher-fails-if-there-is-a-space-in
4. http://stackoverflow.com/questions/24627525/fatal-error-in-launcher-unable-to-create-process-using-c-program-files-x86/26428562#26428562
5. Theano installation tutorial
6. Installation of Theano on Windows
7. http://stackoverflow.com/questions/33687103/how-to-install-theano-on-anaconda-python-2-7-x64-on-windows?noredirect=1&lq=1
8. Official Keras tutorial
9. Official Theano tutorial