TensorFlow is a distributed machine learning framework built primarily for deep learning; the previous post, TensorFlow簡易學習[2]:實現線性回歸, used linear regression only to illustrate that it also handles classic algorithms. This post demonstrates how to create and train a neural network in TensorFlow.
It covers the following topics:
Neural network basics
Basic activation functions
Building a neural network
Neural Network Basics
There are plenty of resources on neural networks; Andrew Ng's tutorial is a good starting point.
Basic Activation Functions
A common explanation of why activation functions matter: without them, every layer of a neural network performs only a linear transformation, and stacking multiple layers still yields a linear transformation. Since linear models have limited expressive power, activation functions are introduced to add non-linearity (ref1); the short demonstration below makes this concrete. For how to choose an activation function and the pros and cons of each, see ref1 and ref2.
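A minimal numpy sketch of this collapse (the matrix shapes are arbitrary, chosen only for illustration): composing two linear layers without an activation is equivalent to a single linear layer whose weight is the product of the two.

import numpy as np

# two "layers" with no activation function
W1 = np.random.randn(4, 8)   # weights of layer 1 (illustrative shape)
W2 = np.random.randn(8, 2)   # weights of layer 2
x = np.random.randn(1, 4)    # one input sample

two_layers = (x @ W1) @ W2   # forward pass through both layers
one_layer = x @ (W1 @ W2)    # a single linear layer with weight W1@W2

# identical up to floating-point error: the extra depth added no expressive power
print(np.allclose(two_layers, one_layer))  # True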
Commonly used activation functions include (ref2): tanh, relu, sigmoid, softplus.
Implementing these activation functions in TensorFlow:
#!/usr/bin/python
'''
Show the most used activation functions in Network
'''
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 200)

#1. struct
#following are popular activation functions
y_relu = tf.nn.relu(x)
y_sigmoid = tf.nn.sigmoid(x)
y_tanh = tf.nn.tanh(x)
y_softplus = tf.nn.softplus(x)

#2. session
sess = tf.Session()
y_relu, y_sigmoid, y_tanh, y_softplus = sess.run([y_relu, y_sigmoid, y_tanh, y_softplus])

# plot these activation functions
plt.figure(1, figsize=(8, 6))

plt.subplot(221)
plt.plot(x, y_relu, c='red', label='y_relu')
plt.ylim((-1, 5))
plt.legend(loc='best')

plt.subplot(222)
plt.plot(x, y_sigmoid, c='b', label='y_sigmoid')
plt.ylim((-1, 5))
plt.legend(loc='best')

plt.subplot(223)
plt.plot(x, y_tanh, c='b', label='y_tanh')
plt.ylim((-1, 5))
plt.legend(loc='best')

plt.subplot(224)
plt.plot(x, y_softplus, c='c', label='y_softplus')
plt.ylim((-1, 5))
plt.legend(loc='best')

plt.show()
Result: (figure: the four activation curves plotted in a 2x2 grid)
Building a Neural Network
Creating a Layer
Define a function that creates a hidden or output layer:
#add a layer and return outputs of the layer
def add_layer(inputs, in_size, out_size, activation_function=None):
    #1. initial weights [in_size, out_size]
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    #2. bias: (+0.1)
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    #3. input*Weight + bias
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    #4. activation
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs
Defining the Network Structure
Here we define a three-layer network: input layer, one hidden layer, and output layer. Additional layers can be stacked with the function above. The network is fully connected. The input placeholder `xs` used below is sketched right after the snippet.
# add hidden layer
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
# add output layer
prediction = add_layer(l1, 10, 1, activation_function=None)
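The snippet above assumes `xs` has already been defined as a placeholder for the training inputs. A minimal sketch of those definitions, matching the shapes used in the complete code below:

# placeholders: one feature per sample, batch size left open (None)
xs = tf.placeholder(tf.float32, [None, 1])  # inputs
ys = tf.placeholder(tf.float32, [None, 1])  # targets, used by the loss below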
Training
Train for 1000 steps using gradient descent.
# loss function: square error
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction), reduction_indices=[1]))
GD = tf.train.GradientDescentOptimizer(0.1)
train_step = GD.minimize(loss)
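`train_step` only defines a single update op; it still has to be run repeatedly inside a session. A minimal sketch of that loop, assuming `x_data` and `y_data` are the training arrays created in the complete code below:

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for step in range(1000):
    # placeholders are fed through feed_dict on every run
    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})
    if step % 50 == 0:
        # print the current loss to watch training progress
        print(step, sess.run(loss, feed_dict={xs: x_data, ys: y_data}))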
Complete Code
#!/usr/bin/python
'''
Build a simple network
'''
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

#1. add_layer
def add_layer(inputs, in_size, out_size, activation_function=None):
    #1. initial weights [in_size, out_size]
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    #2. bias: (+0.1)
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    #3. input*Weight + bias
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    #4. activation
    ## when activation_function is None this is the output layer
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs

## begin build network struct ##
## network: 1 * 10 * 10 * 1

#2. create data
x_data = np.linspace(-1, 1, 300)[:, np.newaxis]
noise = np.random.normal(0, 0.05, x_data.shape)
y_data = np.square(x_data) - 0.5 + noise

#3. placeholder: waiting for the training data
xs = tf.placeholder(tf.float32, [None, 1])
ys = tf.placeholder(tf.float32, [None, 1])

#4. add hidden layers
h1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
h2 = add_layer(h1, 10, 10, activation_function=tf.nn.relu)

#5. add output layer
prediction = add_layer(h2, 10, 1, activation_function=None)

#6. loss function: square error
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction), reduction_indices=[1]))
GD = tf.train.GradientDescentOptimizer(0.1)
train_step = GD.minimize(loss)
## end build network struct ##

## initialize the variables
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()

## session
sess = tf.Session()
sess.run(init)

## plot the raw data before training
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.scatter(x_data, y_data)
plt.ion()
plt.show()

## training
for step in range(1000):
    # operations that depend on placeholders are fed through feed_dict
    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})
    if step % 50 == 0:
        # to visualize the result and improvement
        try:
            ax.lines.remove(lines[0])
        except Exception:
            pass
        prediction_value = sess.run(prediction, feed_dict={xs: x_data})
        # plot the prediction
        lines = ax.plot(x_data, prediction_value, 'r-', lw=5)
        plt.pause(1)

sess.close()
Result: (figure: the red prediction curve progressively fitting the scattered quadratic data)
This concludes the TensorFlow簡易學習 series.
--------------------------------------
Note: this series records my early study notes and covers only basic concepts and operations, without going into advanced material. Text references are cited in the article; the code is adapted from 莫凡 (Morvan).