Please credit the source when reposting:
http://www.cnblogs.com/darkknightzh/p/7608916.html
References:
https://stackoverflow.com/questions/39758094/clearing-tensorflow-gpu-memory-after-model-execution
https://github.com/tensorflow/tensorflow/issues/1727#issuecomment-285815312
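In TensorFlow, if you configure the GPU inside a function, TF allocates GPU memory, and that memory is not released when the function returns (Torch7 seems to behave the same way). The second reference explains why: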
As for the original problem, currently the Allocator in the GPUDevice belongs to the ProcessState, which is essentially a global singleton. The first session using GPU initializes it, and frees itself when the process shuts down. Even if a second session chooses a different GPUOptions, it would not take effect.
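In other words, once the first session has initialized the GPU, a second session that tries to initialize it with different GPU options has no effect, even if the first session's memory has been released.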
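To observe this yourself, a minimal sketch along the following lines may help. It assumes TensorFlow 1.x with GPU support (the same API generation used elsewhere in this post); the function name `compare_gpu_options` and the memory fractions 0.3 and 0.9 are made up for illustration. While each prompt is waiting, check GPU memory usage with a tool such as nvidia-smi.

```python
def compare_gpu_options():
    """Open two sessions in one process with different GPU memory fractions."""
    # Imported lazily; assumes TensorFlow 1.x built with GPU support.
    import tensorflow as tf

    # First session in this process: request roughly 30% of the GPU memory.
    first = tf.ConfigProto(
        gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.3))
    with tf.Session(config=first) as sess:
        sess.run(tf.constant(1.0))
        input("First session is open. Check GPU memory, then press Enter. ")

    # Second session: request roughly 90%. According to the comment quoted
    # above, the allocator created by the first session is reused, so this
    # request is not expected to change the allocation.
    second = tf.ConfigProto(
        gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.9))
    with tf.Session(config=second) as sess:
        sess.run(tf.constant(1.0))
        input("Second session is open. Check GPU memory, then press Enter. ")


if __name__ == "__main__":
    compare_gpu_options()
```

If the quoted explanation holds for your TensorFlow version, the memory usage reported at the two prompts should look the same.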
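In the first reference, Oli Blum points out that the only way to release the GPU memory is to "use processes and shut them down after the computation". The code is as follows (see the first reference for details):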
    import multiprocessing

    import numpy as np
    import tensorflow as tf


    def run_tensorflow():
        n_input = 10000
        n_classes = 1000

        # Create model: a single linear layer (no activation)
        def multilayer_perceptron(x, weight):
            layer_1 = tf.matmul(x, weight)
            return layer_1

        # Layer weights (no bias term)
        weights = tf.Variable(tf.random_normal([n_input, n_classes]))

        x = tf.placeholder("float", [None, n_input])
        y = tf.placeholder("float", [None, n_classes])
        pred = multilayer_perceptron(x, weights)

        cost = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
        optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)

        init = tf.global_variables_initializer()

        with tf.Session() as sess:
            sess.run(init)

            # Train for 100 steps on random batches of 10 samples
            for i in range(100):
                batch_x = np.random.rand(10, n_input)
                batch_y = np.random.rand(10, n_classes)
                sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})

        print("finished doing stuff with tensorflow!")


    if __name__ == "__main__":
        # option 1: execute code with extra process
        p = multiprocessing.Process(target=run_tensorflow)
        p.start()
        p.join()

        # wait until user presses enter key
        input()

        # option 2: just execute the function
        run_tensorflow()

        # wait until user presses enter key
        input()
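When run_tensorflow is run through multiprocessing.Process, the GPU memory is released automatically once the process exits; when run_tensorflow is called directly, it is not. Note that this function does very little computation. On a fast GPU you may not catch the allocate, compute, and release cycle after starting the multiprocessing.Process, and it can look as if nothing ran at all.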
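In practice you usually want the result of the computation back in the parent process. One possible way, sketched below, is to send it through a multiprocessing.Pipe before the child exits. The helper names `run_in_process` and `_child` and the stand-in workload `sum_of_squares` are hypothetical and do not come from the referenced answers. The sketch assumes Python 3 on a POSIX system where the "fork" start method is available, and that the parent process has not touched the GPU itself.

```python
import multiprocessing


def _child(func, args, conn):
    """Run func(*args) in the child and send the outcome to the parent."""
    try:
        conn.send(("ok", func(*args)))
    except Exception as exc:  # report failures instead of dying silently
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_in_process(func, *args):
    """Run func(*args) in a short-lived child process and return its result."""
    # Assumption: "fork" is available (POSIX only).
    ctx = multiprocessing.get_context("fork")
    recv_conn, send_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_child, args=(func, args, send_conn))
    proc.start()
    send_conn.close()  # the parent only reads from the pipe
    try:
        status, payload = recv_conn.recv()
    except EOFError:
        # The child exited (for example, crashed) without sending anything.
        status, payload = "error", "child exited without sending a result"
    proc.join()  # resources held by the child are gone once it has exited
    if status == "error":
        raise RuntimeError(payload)
    return payload


def sum_of_squares(n):
    """Stand-in workload; a function that runs TensorFlow would go here."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    print(run_in_process(sum_of_squares, 1000))
```

Replacing `sum_of_squares` with a function that builds and runs the TensorFlow graph should keep all GPU allocations inside the child, so they go away when it exits. The return value has to be picklable, for example plain Python numbers or NumPy arrays.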