First, the code (an example from sklearn):
```python
from sklearn.neural_network import MLPClassifier

X = [[0., 0.], [1., 1.]]
y = [0, 1]
clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                    hidden_layer_sizes=(5, 2), random_state=1)
clf.fit(X, y)
print('predict\t', clf.predict([[2., 2.], [-1., -2.]]))
print('predict_proba\t', clf.predict_proba([[2., 2.], [1., 2.]]))
print('clf.coefs_ contains the weight matrices that constitute the model parameters:\t',
      [coef.shape for coef in clf.coefs_])
print(clf)
c = 0
for i in clf.coefs_:
    c += 1
    print(c, len(i), i)
```
Notes:
MLPClassifier: MLP is short for Multi-layer Perceptron.
fit(X, y) takes feature and label inputs just like any other sklearn estimator.
solver='lbfgs': the MLP's optimization method. L-BFGS performs well on small datasets; Adam is fairly robust; SGD can give the best results (in classification accuracy versus number of iterations) when its parameters are well tuned. SGD stands for stochastic gradient descent. Open question: how SGD relates to backpropagation. (Briefly: backpropagation computes the gradients of the loss with respect to the weights, and SGD is the update rule that uses those gradients.)
alpha: the L2 regularization strength. MLP supports regularization, L2 by default; the exact value needs tuning.
hidden_layer_sizes=(5, 2): two hidden layers, with 5 neurons in the first and 2 in the second.
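As a sanity check on the hidden_layer_sizes note, the weight-matrix shapes can be read off clf.coefs_. For this two-feature binary problem sklearn uses a single logistic output unit, so the layer widths are 2 → 5 → 2 → 1 (a small verification sketch, reusing the example data above):

```python
from sklearn.neural_network import MLPClassifier

X = [[0., 0.], [1., 1.]]
y = [0, 1]
clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                    hidden_layer_sizes=(5, 2), random_state=1).fit(X, y)

# One weight matrix per layer transition: (2, 5), (5, 2), (2, 1),
# plus one bias vector per non-input layer: (5,), (2,), (1,).
print([coef.shape for coef in clf.coefs_])   # [(2, 5), (5, 2), (2, 1)]
print([b.shape for b in clf.intercepts_])    # [(5,), (2,), (1,)]
```

This is why the example above prints three entries when looping over clf.coefs_: one matrix per layer transition.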
Computational time complexity (very high…):
Suppose there are n training samples, m features, k hidden layers each containing h neurons (for simplicity), and o output neurons. The time complexity of backpropagation is O(n · m · h^k · o · i), where i is the number of iterations. Since backpropagation has a high time complexity, it is advisable to start with a small number of hidden neurons and few hidden layers when training.
The settings involved: the number of hidden layers k, the number of neurons per layer h, and the number of iterations i.
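To get a feel for how quickly this grows, here is a back-of-the-envelope calculation of n · m · h^k · o · i with hypothetical (not from the example above) values:

```python
# Hypothetical workload: 10,000 samples, 20 features, 2 hidden layers of
# 100 neurons each, 1 output neuron, 200 iterations.
n, m, h, k, o, i = 10_000, 20, 100, 2, 1, 200

ops = n * m * h**k * o * i
print(f"{ops:.1e}")  # 4.0e+11 -- hundreds of billions of operations
```

Doubling k (adding more hidden layers of the same width) multiplies the count by h^k again, which is why starting small is the practical advice.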