For model selection, we usually build the model with the train function from the caret package and then evaluate it.
Method 1:
library(caret)

set.seed(1234)
tr_control <- trainControl(method = 'cv', number = 5)
# Build a random forest model with 5-fold cross-validation
model_rf <- train(Class ~ ., data = traindata, trControl = tr_control, method = 'rf')
model_rf

Output
mtry Accuracy Kappa
2 0.9276465 0.8552977
16 0.9314521 0.8628921
30 0.9276627 0.8553120
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mtry = 16.
Method 2:

set.seed(1234)
# Same model, but choose the final mtry with the one-standard-error rule
model_rf <- train(Class ~ ., data = traindata, method = 'rf',
                  trControl = trainControl(method = 'cv', number = 5,
                                           selectionFunction = 'oneSE'))
model_rf

Output
mtry Accuracy Kappa
2 0.9276143 0.8552365
16 0.9212771 0.8425685
30 0.9250988 0.8502003
Accuracy was used to select the optimal model using the one SE rule.
The final value used for the model was mtry = 2.
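The two methods end up with different final models, and the selection rule differs as well. Both rank the candidates by Accuracy, but Method 1 picks the mtry with the largest accuracy, while Method 2 applies the one-standard-error (one SE) rule, which picks the simplest model whose accuracy is within one standard error of the best. Note that the resampled accuracies also differ slightly between the two runs, and in the second run mtry = 2 happens to have the highest accuracy as well.
The reason is that Method 2 sets selectionFunction = 'oneSE' in trainControl. When the argument is omitted, as in Method 1, caret falls back to its default, selectionFunction = 'best'.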
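
To make the difference between the two rules concrete, here is a minimal base R sketch that applies both to the Method 1 accuracies. The AccuracySD values are made up for illustration (the printed output above does not show them), and the sketch assumes the rows are ordered from simplest to most complex, with a smaller mtry treated as simpler. It is a hand-rolled approximation of the idea, not caret's own code.

```r
# Accuracies from the Method 1 output; AccuracySD is hypothetical,
# chosen only to illustrate the rule (not taken from the original run).
results <- data.frame(
  mtry       = c(2, 16, 30),
  Accuracy   = c(0.9276465, 0.9314521, 0.9276627),
  AccuracySD = c(0.01, 0.01, 0.01)
)
n_folds <- 5  # matches number = 5 in trainControl

# "best" rule: the row with the highest accuracy
best_row <- which.max(results$Accuracy)

# one-SE rule: the first (simplest) row whose accuracy is within
# one standard error of the best accuracy
se_best    <- results$AccuracySD[best_row] / sqrt(n_folds)
threshold  <- results$Accuracy[best_row] - se_best
one_se_row <- min(which(results$Accuracy >= threshold))

results$mtry[best_row]    # 16
results$mtry[one_se_row]  # 2
```

With these assumed standard deviations, the threshold is about 0.9270, so mtry = 2 (accuracy 0.9276) already qualifies and the one-SE rule prefers it over mtry = 16. On a fitted model, the real standard deviations can be read from model_rf$results.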