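1. Understanding how Dropout works
Deep networks have many parameters, so they are prone to overfitting and slow to train. To address overfitting, Hinton et al. proposed Dropout in their 2012 paper "Improving neural networks by preventing co-adaptation of feature detectors" and showed experimentally that it reduces overfitting effectively.
Dropout means that, during training, units of the network are temporarily dropped with a given probability. This is equivalent to sampling a thinner network from the original one, as the schematic in the original post illustrates.
It is implemented by sampling a random mask that multiplies some neurons by 0 and the others by 1. In a network with N neurons there are therefore 2^N possible thinned configurations.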
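The 2^N count can be checked on a tiny case. The following is a minimal sketch using only the standard library; the helper names `all_masks`, `sample_mask` and `apply_mask` are made up for illustration and do not come from the paper.

```python
import itertools
import random

def all_masks(n):
    """Return every 0/1 keep-mask over n units (there are 2**n of them)."""
    return list(itertools.product([0, 1], repeat=n))

def sample_mask(n, keep_prob, rng):
    """Sample one keep-mask: each unit is kept (1) with probability keep_prob."""
    return [1 if rng.random() < keep_prob else 0 for _ in range(n)]

def apply_mask(activations, mask):
    """Zero out the activations of dropped units."""
    return [a * m for a, m in zip(activations, mask)]

masks = all_masks(3)
print(len(masks))  # 8 == 2**3 thinned configurations for 3 units

rng = random.Random(0)  # fixed seed so the run is repeatable
mask = sample_mask(3, 0.5, rng)
print(mask, apply_mask([0.2, 0.7, 0.9], mask))
```

Each training step draws one such mask, so over many steps the network is effectively trained as a large family of weight-sharing sub-networks.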
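For the details, see the paper (I do not fully understand it myself; see the implementation).
When to use it:
1 Dropout is mainly useful when the training data is limited and the model overfits easily.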
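L1 and L2 regularization minimize the structural risk.
In particular:
L1 produces sparse parameters (more of them are exactly 0).
L2 spreads the weight across parameters: it favors simpler solutions whose parameter values are close to 0, so the resulting model is smoother.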
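The difference between the two penalties can be seen on a handful of weights. This is a minimal sketch, assuming one proximal (shrinkage) step with strength `lam`; the function names and the example weights are invented for illustration.

```python
import math

def l1_shrink(weights, lam):
    """Soft-thresholding: the proximal step for the penalty lam * sum(|w|)."""
    out = []
    for w in weights:
        magnitude = max(abs(w) - lam, 0.0)
        # Weights smaller than lam in magnitude are set to exactly 0
        out.append(math.copysign(magnitude, w) if magnitude > 0 else 0.0)
    return out

def l2_shrink(weights, lam):
    """Proximal step for the penalty (lam / 2) * sum(w**2): scale by 1 / (1 + lam)."""
    return [w / (1.0 + lam) for w in weights]

weights = [0.8, -0.05, 0.02, -0.6]
print(l1_shrink(weights, 0.1))  # the two small weights become exactly 0
print(l2_shrink(weights, 0.1))  # every weight shrinks, none reaches 0
```

L1 subtracts a fixed amount from every weight and clips at zero, which is where the sparsity comes from; L2 rescales every weight by the same factor, so small weights stay small but non-zero.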
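2. Implementing regularization in code (L1, L2, Dropout)
L1 norm
The L1 norm is the sum of the absolute values of the elements of the parameter matrix W. It differs from the L0 norm in that minimizing the L0 norm is an NP-hard problem, whereas the L1 norm is the tightest convex approximation of the L0 norm and is much easier to optimize. L1-regularized regression is commonly called LASSO.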
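As a small worked example of the two norms, here is a sketch in plain Python. The matrix `W` is made-up data, and the norms are taken elementwise over the whole matrix, following the definition above.

```python
def l0_norm(W):
    """Number of non-zero entries in W (given as a list of rows)."""
    return sum(1 for row in W for w in row if w != 0)

def l1_norm(W):
    """Sum of the absolute values of all entries in W."""
    return sum(abs(w) for row in W for w in row)

W = [[0.5, 0.0, -1.5],
     [0.0, 2.0, 0.0]]
print(l0_norm(W))  # 3 non-zero entries
print(l1_norm(W))  # 0.5 + 1.5 + 2.0 = 4.0
```

The L0 norm only counts entries and is flat almost everywhere, which is why it gives no usable gradient; the L1 norm varies continuously with each weight and can be added directly to a training loss.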
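for epoch in range(EPOCHS):
    y_pred = model(x_train)
    classify_loss = criterion(y_pred, y_train.float().view(-1, 1))
    # Recompute the L1 penalty on every iteration so that it tracks the current
    # weights; computing it once before the loop leaves it stale and breaks backward()
    regularization_loss = 0
    for param in model.parameters():
        regularization_loss += torch.sum(torch.abs(param))
    loss = classify_loss + 0.001 * regularization_loss  # add the L1 regularization term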
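L2 norm
The L2 penalty is the sum of the squares of the elements of the parameter matrix W (the squared L2 norm). Unlike the two norms above, it does not drive parameters to exactly 0; instead it makes most of them close to 0. L1 pursues sparsity and thereby discards some features (their parameters become 0), while L2 only pushes parameters toward 0 and keeps every feature. L2-regularized regression is called Ridge.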
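criterion = torch.nn.BCELoss()  # define the loss function
# weight_decay is the L2 regularization coefficient. It must be positive to have
# any effect (0 disables it); 0.001 is only an illustrative value.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0, dampening=0, weight_decay=0.001)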
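The link between `weight_decay` and an explicit L2 penalty can be checked numerically. In `torch.optim.SGD`, the term `weight_decay * param` is added to the gradient, which is the gradient of the penalty `(weight_decay / 2) * sum(w**2)`. The sketch below verifies that relationship in plain Python without PyTorch; all function names are made up for illustration.

```python
def l2_penalty(w, weight_decay):
    """Explicit penalty (weight_decay / 2) * sum of squared weights."""
    return 0.5 * weight_decay * sum(wi * wi for wi in w)

def numeric_grad(f, w, eps=1e-6):
    """Central-difference estimate of the gradient of f at w."""
    grad = []
    for i in range(len(w)):
        up = list(w)
        up[i] += eps
        down = list(w)
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def sgd_step(w, data_grad, lr, weight_decay):
    """One plain SGD step with weight_decay * w added to the data gradient."""
    return [wi - lr * (gi + weight_decay * wi) for wi, gi in zip(w, data_grad)]

w = [1.0, -2.0, 0.5]
wd = 0.1
penalty_grad = numeric_grad(lambda v: l2_penalty(v, wd), w)
print(penalty_grad)  # close to [wd * wi for wi in w]
print(sgd_step(w, [0.0, 0.0, 0.0], lr=0.01, weight_decay=wd))  # weights shrink toward 0
```

With a zero data gradient each step multiplies every weight by `1 - lr * weight_decay`, which is where the name "weight decay" comes from. For adaptive optimizers such as Adam the two formulations are no longer interchangeable in general.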
3. NumPy implementation of Dropout
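import numpy as np

# Toy XOR-style dataset: 4 samples, 3 input features (the last one acts as a bias)
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T

alpha, hidden_dim, dropout_percent, do_dropout = (0.5, 4, 0.2, True)

# Weights initialized uniformly in [-1, 1)
synapse_0 = 2 * np.random.random((3, hidden_dim)) - 1
synapse_1 = 2 * np.random.random((hidden_dim, 1)) - 1

for j in range(60000):  # xrange exists only in Python 2
    # Hidden-layer sigmoid activations before dropout
    hidden = 1 / (1 + np.exp(-np.dot(X, synapse_0)))
    mask = np.ones_like(hidden)
    if do_dropout:
        # Inverted dropout: keep each unit with probability 1 - dropout_percent
        # and rescale the survivors by 1 / (1 - dropout_percent)
        mask = np.random.binomial(1, 1 - dropout_percent, size=hidden.shape) / (1 - dropout_percent)
    layer_1 = hidden * mask
    layer_2 = 1 / (1 + np.exp(-np.dot(layer_1, synapse_1)))
    layer_2_delta = (layer_2 - y) * (layer_2 * (1 - layer_2))
    # Backpropagate through the mask, using the sigmoid derivative of the
    # pre-dropout activations rather than of the masked, rescaled values
    layer_1_delta = layer_2_delta.dot(synapse_1.T) * mask * (hidden * (1 - hidden))
    synapse_1 -= alpha * layer_1.T.dot(layer_2_delta)
    synapse_0 -= alpha * X.T.dot(layer_1_delta)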
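The scaling factor 1/(1 - dropout_percent) in the listing above is what lets the network run unchanged at inference time. The sketch below illustrates this with a standalone helper. The name `dropout` and its `training` flag are assumptions made for illustration, not part of the original code.

```python
import numpy as np

def dropout(x, drop_prob, training, rng):
    """Inverted dropout on array x; identity when not training."""
    if not training or drop_prob == 0.0:
        return x  # inference: no mask and no rescaling
    keep_prob = 1.0 - drop_prob
    mask = rng.binomial(1, keep_prob, size=x.shape)
    return x * mask / keep_prob  # rescale survivors to preserve the expected value

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
activations = np.ones((1000, 100))
train_out = dropout(activations, 0.2, training=True, rng=rng)
test_out = dropout(activations, 0.2, training=False, rng=rng)

print(train_out.mean())      # close to 1.0: the expected activation is preserved
print(np.unique(train_out))  # dropped units are 0, kept units are about 1.25
print(test_out.mean())       # 1.0: activations pass through untouched
```

Because the rescaling happens during training, the expected input to the next layer is the same in both modes, so no extra weight scaling is needed at test time.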
4. Complete code
import numpy as np
import pandas as pd
import torch
import torch.nn.functional as F
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Load the data
data = pd.read_csv(r'C:\Users\betty\Desktop\pytorch學習\data.txt')
# .ix has been removed from pandas; iloc selects by position
# (first 8 columns as features, last column as the label)
x, y = data.iloc[:, :8], data.iloc[:, -1]

# Test set is 20%, training set is 80%
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

# Plain tensors are enough; the Variable wrapper is deprecated
x_train = torch.from_numpy(np.array(x_train)).float()
y_train = torch.from_numpy(np.array(y_train).reshape(-1, 1)).float()
x_test = torch.from_numpy(np.array(x_test)).float()
y_test = torch.from_numpy(np.array(y_test).reshape(-1, 1)).float()

print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.l1 = torch.nn.Linear(8, 200)
        self.l2 = torch.nn.Linear(200, 50)
        self.l3 = torch.nn.Linear(50, 1)

    def forward(self, x):
        out1 = F.relu(self.l1(x))
        # training=self.training turns dropout off after model.eval()
        out2 = F.dropout(out1, p=0.5, training=self.training)
        out3 = F.relu(self.l2(out2))
        out4 = F.dropout(out3, p=0.5, training=self.training)
        # Feed out4 (not out3) so the second dropout layer is actually used
        y_pred = torch.sigmoid(self.l3(out4))
        return y_pred

model = Model()

criterion = torch.nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.1)

Loss = []
model.train()
for epoch in range(2000):
    y_pred = model(x_train)
    loss = criterion(y_pred, y_train)
    if epoch % 400 == 0:
        print("epoch =", epoch, "loss", loss.item())
        Loss.append(loss.item())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Model evaluation
def label_flag(probs):
    # Threshold the predicted probabilities at 0.5 to get 0/1 labels
    return (probs > 0.5).float()

model.eval()  # disable dropout for evaluation
with torch.no_grad():
    y_pred = label_flag(model(x_train))
    y_test_pred = label_flag(model(x_test))

print(classification_report(y_train.numpy(), y_pred.numpy()))

# Test
print(classification_report(y_test.numpy(), y_test_pred.numpy()))
Dataset download link: https://pan.baidu.com/s/1LrJktjVQ1OM9mYt_cuE-FQ
Extraction code: hatv
Original post: https://blog.csdn.net/wehung/article/details/89283583