How to do cross-validation training in Python


Take 4-fold cross-validation training as an example.

(1) Given a dataset data and a label set label

The number of samples is

sampNum = len(data)

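(2) Split all the given examples into 4 groups (folds)

The number of examples in each fold is roughly

foldNum = sampNum // 4

When sampNum is not divisible by 4, the first sampNum % 4 folds each receive one extra example.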
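The fold sizes can be worked out by hand. The sketch below assumes the remainder is handed out to the first folds, which is how scikit-learn's KFold documents its behaviour; the helper name fold_sizes is made up for illustration and is not part of any library.

```python
import numpy as np

def fold_sizes(samp_num, n_folds):
    """Return the size of each fold when samp_num samples are split into n_folds groups."""
    # Every fold gets the integer quotient ...
    sizes = np.full(n_folds, samp_num // n_folds, dtype=int)
    # ... and the first (samp_num % n_folds) folds get one extra sample.
    sizes[: samp_num % n_folds] += 1
    return sizes

sizes = fold_sizes(6, 4)
print(sizes)  # [2 2 1 1]
```

With 6 samples and 4 folds, the folds are therefore not all the same size, so foldNum is only an approximate per-fold count.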

(3) Build the training set and validation set for each fold

See Section 3.1 of the scikit-learn user guide: Cross-validation

import numpy as np
# sklearn.cross_validation was removed in scikit-learn 0.20; KFold now lives in model_selection
from sklearn.model_selection import KFold

# dataset: 6 samples with 2 features each, plus one label per sample
data = np.array([[1,3],[2,4],[3.1,3],[4,5],[5.0,0.3],[4.1,3.1]])
label = np.array([0,1,1,1,0,0])
sampNum = len(data)

# 4-fold (3 parts for training, 1 part for validation)
kf = KFold(n_splits=4)
for iFold, (train_index, val_index) in enumerate(kf.split(data)):
    # X_train, y_train: training set of fold iFold; X_val, y_val: validation set
    X_train, X_val = data[train_index], data[val_index]
    y_train, y_val = label[train_index], label[val_index]

 

  

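The given dataset is the data and label arrays defined in the code above: 6 samples with 2 features each, and one 0/1 label per sample.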

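The index set of all samples (for the 6-sample data in the code) is:

0 1 2 3 4 5

The training-set and validation-set indices for each iFold (4 in total, KFold without shuffling) are:

iFold = 0 (the training set contains 4 examples, the validation set contains 2 examples): training 2 3 4 5, validation 0 1

iFold = 1: training 0 1 4 5, validation 2 3

iFold = 2: training 0 1 2 3 5, validation 4

iFold = 3: training 0 1 2 3 4, validation 5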
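The per-fold indices can be checked with a short script. This is a minimal sketch, assuming scikit-learn 0.18 or later (where KFold is in sklearn.model_selection) and the 6-sample data array from the code earlier in the post; the variable name splits is only for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold

data = np.array([[1, 3], [2, 4], [3.1, 3], [4, 5], [5.0, 0.3], [4.1, 3.1]])

# shuffle defaults to False, so the folds are consecutive blocks of indices
kf = KFold(n_splits=4)

# Collect (train indices, validation indices) for every fold as plain lists
splits = [(train_index.tolist(), val_index.tolist())
          for train_index, val_index in kf.split(data)]

for iFold, (train_index, val_index) in enumerate(splits):
    print(f"iFold = {iFold}: train = {train_index}, val = {val_index}")
```

Passing shuffle=True together with a random_state would give a different, randomized assignment of samples to folds.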

The training set and validation set for each iFold are then:

X_train, X_val, y_train, y_val = data[train_index], data[val_index], label[train_index], label[val_index]
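Once the four pairs of sets exist, the training part amounts to fitting a model on each X_train, y_train and scoring it on the matching X_val, y_val. The post does not name a model, so the sketch below assumes a 1-nearest-neighbour classifier purely as a placeholder; scores and mean_score are illustrative names.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier

data = np.array([[1, 3], [2, 4], [3.1, 3], [4, 5], [5.0, 0.3], [4.1, 3.1]])
label = np.array([0, 1, 1, 1, 0, 0])

kf = KFold(n_splits=4)
scores = []
for iFold, (train_index, val_index) in enumerate(kf.split(data)):
    X_train, X_val = data[train_index], data[val_index]
    y_train, y_val = label[train_index], label[val_index]

    # Placeholder model (an assumption, not from the original post)
    clf = KNeighborsClassifier(n_neighbors=1)
    clf.fit(X_train, y_train)

    # Accuracy on the held-out fold
    scores.append(clf.score(X_val, y_val))

# The cross-validation estimate is the average over the 4 folds
mean_score = float(np.mean(scores))
print(scores, mean_score)
```

Each sample is used for validation exactly once, so the averaged score reflects the whole dataset even though every individual model saw only part of it.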

  

 

