5.1 Cost Function
Suppose the training set is $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})\}$.
L = total number of layers in the network
$s_l$ = number of units (not counting the bias unit) in layer $l$
K = number of output units/classes
For the neural network shown in the figure, $L = 4$, $s_1 = 3$, $s_2 = 5$, $s_3 = 5$, $s_4 = 4$.
The cost function for logistic regression:
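$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\log h_\theta(x^{(i)}) + \big(1-y^{(i)}\big)\log\big(1-h_\theta(x^{(i)})\big)\Big] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$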
The cost function for a neural network:
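$$J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\Big[y_k^{(i)}\log\big(h_\Theta(x^{(i)})\big)_k + \big(1-y_k^{(i)}\big)\log\Big(1-\big(h_\Theta(x^{(i)})\big)_k\Big)\Big] + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\big(\Theta_{ji}^{(l)}\big)^2$$

The double sum accumulates the logistic-regression cost over each of the $K$ output units, and the regularization term sums the squares of every weight except those multiplying the bias units.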
5.2 Backpropagation
An accessible explanation of the backpropagation algorithm: http://blog.csdn.net/shijing_0214/article/details/51923547
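In outline, backpropagation computes an error term $\delta^{(l)}$ for each layer, working backwards from the output layer. For the sigmoid activations used here, the standard recurrences are:

$$\delta^{(L)} = a^{(L)} - y$$

$$\delta^{(l)} = \big((\Theta^{(l)})^{T}\delta^{(l+1)}\big) \odot a^{(l)} \odot \big(1 - a^{(l)}\big), \qquad l = L-1, \ldots, 2$$

$$\frac{\partial J(\Theta)}{\partial \Theta_{ij}^{(l)}} = \frac{1}{m}\sum_{t=1}^{m} \delta_i^{(l+1)(t)}\, a_j^{(l)(t)} + \frac{\lambda}{m}\Theta_{ij}^{(l)} \quad (j \ge 1)$$

where $\odot$ is the element-wise product and the regularization term is omitted for the bias weights ($j = 0$). These partial derivatives feed directly into the gradient-descent step of section 5.3.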
5.3 Training a Neural Network
The hidden layers usually all have the same number of units. More hidden layers generally give better results, but at a higher computational cost.
The overall training procedure (a minimal Python sketch follows this list):
- Randomly initialize the weights
- Implement forward propagation to get $h_\Theta(x^{(i)})$ for any $x^{(i)}$
- Implement the cost function
- Implement backpropagation to compute partial derivatives
- Use gradient checking to confirm that your backpropagation works. Then disable gradient checking.
- Use gradient descent or a built-in optimization function to minimize the cost function over the weights in $\Theta$.
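As a concrete illustration of these steps, below is a minimal NumPy sketch, assuming one hidden layer with sigmoid activations, the regularized cost above, and plain batch gradient descent. The layer sizes, helper names (`init_weights`, `gradients`, `grad_check`), the learning rate `alpha`, and `lam` are illustrative assumptions, not fixed by these notes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_weights(rows, cols, eps=0.12):
    # Random initialization in [-eps, eps] breaks the symmetry between units.
    return np.random.uniform(-eps, eps, (rows, cols))

def forward(X, Theta1, Theta2):
    # Forward propagation; a column of ones plays the role of the bias unit.
    m = X.shape[0]
    a1 = np.hstack([np.ones((m, 1)), X])                       # m x (n+1)
    a2 = np.hstack([np.ones((m, 1)), sigmoid(a1 @ Theta1.T)])  # m x (s2+1)
    a3 = sigmoid(a2 @ Theta2.T)                                # m x K, i.e. h_Theta(x)
    return a1, a2, a3

def cost(Y, a3, Theta1, Theta2, lam):
    # Regularized cross-entropy cost J(Theta); bias columns are not penalized.
    m = Y.shape[0]
    J = -np.sum(Y * np.log(a3) + (1 - Y) * np.log(1 - a3)) / m
    J += lam / (2 * m) * (np.sum(Theta1[:, 1:] ** 2) + np.sum(Theta2[:, 1:] ** 2))
    return J

def gradients(X, Y, Theta1, Theta2, lam):
    # Backpropagation: delta3 = a3 - y, then propagate the error backwards.
    m = X.shape[0]
    a1, a2, a3 = forward(X, Theta1, Theta2)
    delta3 = a3 - Y
    delta2 = (delta3 @ Theta2)[:, 1:] * a2[:, 1:] * (1 - a2[:, 1:])
    grad1, grad2 = delta2.T @ a1 / m, delta3.T @ a2 / m
    grad1[:, 1:] += lam / m * Theta1[:, 1:]                    # regularize, skipping bias
    grad2[:, 1:] += lam / m * Theta2[:, 1:]
    return grad1, grad2

def grad_check(X, Y, Theta1, Theta2, lam, i=0, j=1, eps=1e-4):
    # Gradient checking: one backprop gradient vs. a finite difference.
    T = Theta1.copy(); T[i, j] += eps
    Jp = cost(Y, forward(X, T, Theta2)[2], T, Theta2, lam)
    T[i, j] -= 2 * eps
    Jm = cost(Y, forward(X, T, Theta2)[2], T, Theta2, lam)
    numeric = (Jp - Jm) / (2 * eps)
    analytic = gradients(X, Y, Theta1, Theta2, lam)[0][i, j]
    print("numeric:", numeric, "backprop:", analytic)          # should nearly agree

# Toy data: 3 inputs, one hidden layer of 5 units, 4 output classes.
np.random.seed(0)
X = np.random.randn(20, 3)
Y = np.eye(4)[np.random.randint(0, 4, 20)]                     # one-hot labels
Theta1, Theta2 = init_weights(5, 4), init_weights(4, 6)

lam, alpha = 1.0, 0.5
grad_check(X, Y, Theta1, Theta2, lam)                          # verify, then train
for step in range(2000):                                       # plain batch gradient descent
    g1, g2 = gradients(X, Y, Theta1, Theta2, lam)
    Theta1 -= alpha * g1
    Theta2 -= alpha * g2
print("final cost:", cost(Y, forward(X, Theta1, Theta2)[2], Theta1, Theta2, lam))
```

Running the check before training should show the two numbers agreeing to several decimal places; once that is confirmed, the check is disabled (here it simply runs once) because it is far too slow to run on every iteration.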