For the past few months I've been working on Deep Learning projects with a few collaborators. Beyond picking up frameworks and tools like TensorFlow and GPUs, the biggest gain has been conceptual: neural networks allow a remarkable amount of variation, so you can design creatively around the data and problem at hand. Today let's talk about an Unsupervised Learning NN: the Autoencoder.
An Autoencoder has two defining traits: first, the data contains only X, with no y; second, the input and output layers have the same number of nodes. You can think of it as using a neural network to compress the input data down to its essence and then reconstruct it. Its structure diagram is as follows:
That may sound abstract, but architecturally it is simply a fully-connected network. The key difference from a typical Feed-forward NN, whose neurons per layer usually shrink steadily toward the output, is the layer shape: in an Autoencoder the neurons per layer first decrease (this part is called the Encoder), reach a bottleneck layer in the middle, and then grow back to the original size (this part is called the Decoder). At the output layer we want the output to match the input, so the loss function is the MSE between the two, which we then minimize with Back-propagation.
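The encode-bottleneck-decode idea above can be sketched in a few lines of NumPy. This is a minimal illustration only: the single encoder/decoder layer pair, the 784 → 64 sizes, the sigmoid activation, and the random weights are assumptions chosen to mirror the fully-connected design described in this post, not trained values.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_input, n_bottleneck = 784, 64

# Encoder weights: compress 784 -> 64
W_enc = rng.normal(size=(n_input, n_bottleneck)) * 0.01
b_enc = np.zeros(n_bottleneck)
# Decoder weights: reconstruct 64 -> 784
W_dec = rng.normal(size=(n_bottleneck, n_input)) * 0.01
b_dec = np.zeros(n_input)

x = rng.random((1, n_input))            # one fake "image", values in [0, 1]
code = sigmoid(x @ W_enc + b_enc)       # bottleneck representation (1, 64)
x_hat = sigmoid(code @ W_dec + b_dec)   # reconstruction (1, 784)

# The loss: mean squared error between input and reconstruction,
# the same objective the full TensorFlow code below minimizes.
mse = np.mean((x - x_hat) ** 2)
```

Training then consists of back-propagating this MSE through both halves of the network at once, so the encoder and decoder are learned jointly.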
The workflow diagram is as follows:
The dataset is the MNIST handwritten-digit set, with 3 encoder layers and 3 decoder layers; the core code is at the bottom of this post. Comparing the original images with the compressed-then-reconstructed ones below, we can see that although the Autoencoder compresses the data heavily at the bottleneck, the decoder can still recover the images reasonably well:
Of course, compressing and then decompressing just to get back roughly the same image is not very useful on its own. So consider keeping only the encoder after training, i.e. the first half of the Autoencoder, and using it to reduce the dimensionality of the data. What do we get? In this example, to reduce the data all the way down to 2 dimensions, I added another 2 hidden layers and removed the activation function from the bottleneck layer; the result is shown below:
As you can see, the data has been compressed from 784-dimensional space down to 2 dimensions, producing something akin to a clustering of the digits.
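Using only the encoder half for dimensionality reduction can be sketched as follows. The layer sizes (784 → 128 → 2) and the random weights here are made up for illustration; in practice you would reuse the trained encoder weights. Note that, as described above, the bottleneck has no activation function, so the 2-D codes are not squashed into (0, 1).

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(size=(784, 128)) * 0.01
b1 = np.zeros(128)
W2 = rng.normal(size=(128, 2)) * 0.01   # bottleneck: 2 dimensions
b2 = np.zeros(2)

def encode_2d(x):
    """Project MNIST-sized inputs down to 2-D using only the encoder."""
    h = sigmoid(x @ W1 + b1)
    return h @ W2 + b2                  # linear bottleneck (no activation)

batch = rng.random((5, 784))            # five fake MNIST-sized inputs
codes = encode_2d(batch)                # shape (5, 2), ready to scatter-plot
```

Scatter-plotting `codes` colored by digit label is what produces the clustering-like picture above.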
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data", one_hot=True)

# Parameters
learning_rate = 0.001
training_epochs = 50
batch_size = 256
display_step = 1
examples_to_show = 10

# Network parameters
n_input = 784  # MNIST data input (img shape: 28*28)
# hidden layer settings
n_hidden_1 = 256  # 1st layer num features
n_hidden_2 = 128  # 2nd layer num features
n_hidden_3 = 64   # 3rd layer num features

X = tf.placeholder(tf.float32, [None, n_input])

weights = {
    'encoder_h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'encoder_h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'encoder_h3': tf.Variable(tf.random_normal([n_hidden_2, n_hidden_3])),
    'decoder_h1': tf.Variable(tf.random_normal([n_hidden_3, n_hidden_2])),
    'decoder_h2': tf.Variable(tf.random_normal([n_hidden_2, n_hidden_1])),
    'decoder_h3': tf.Variable(tf.random_normal([n_hidden_1, n_input])),
}
biases = {
    'encoder_b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'encoder_b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'encoder_b3': tf.Variable(tf.random_normal([n_hidden_3])),
    'decoder_b1': tf.Variable(tf.random_normal([n_hidden_2])),
    'decoder_b2': tf.Variable(tf.random_normal([n_hidden_1])),
    'decoder_b3': tf.Variable(tf.random_normal([n_input])),
}

# Building the encoder
def encoder(x):
    # Encoder hidden layer with sigmoid activation #1
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['encoder_h1']),
                                   biases['encoder_b1']))
    # Encoder hidden layer with sigmoid activation #2
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['encoder_h2']),
                                   biases['encoder_b2']))
    # Encoder hidden layer with sigmoid activation #3 (bottleneck)
    layer_3 = tf.nn.sigmoid(tf.add(tf.matmul(layer_2, weights['encoder_h3']),
                                   biases['encoder_b3']))
    return layer_3

# Building the decoder
def decoder(x):
    # Decoder hidden layer with sigmoid activation #1
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['decoder_h1']),
                                   biases['decoder_b1']))
    # Decoder hidden layer with sigmoid activation #2
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['decoder_h2']),
                                   biases['decoder_b2']))
    # Decoder hidden layer with sigmoid activation #3 (output)
    layer_3 = tf.nn.sigmoid(tf.add(tf.matmul(layer_2, weights['decoder_h3']),
                                   biases['decoder_b3']))
    return layer_3

# Construct model
encoder_op = encoder(X)           # 64 features at the bottleneck
decoder_op = decoder(encoder_op)  # 784 features reconstructed

# Prediction
y_pred = decoder_op  # after reconstruction
# Targets (labels) are the input data itself
y_true = X           # before compression

# Define loss and optimizer, minimize the squared error
cost = tf.reduce_mean(tf.pow(y_true - y_pred, 2))
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)

# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    total_batch = int(mnist.train.num_examples / batch_size)
    # Training cycle
    for epoch in range(training_epochs):
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)  # max(x) = 1, min(x) = 0
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([optimizer, cost], feed_dict={X: batch_xs})
        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(c))
    print("Optimization Finished!")