C is a user-specified coefficient that controls how heavily misclassified points are penalised. When C is large, fewer training points are misclassified, but overfitting can be severe; when C is small, many points may be misclassified, and the resulting model may not be very accurate either. Choosing C well is therefore something of an art; in most cases it is done by informed trial and error.
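To make the trade-off concrete, here is a minimal sketch in plain Python (no SVM library). The toy data set and the Pegasos-style subgradient trainer are my own illustration, not anything from the thesis quoted below. It trains a linear soft-margin SVM with a small and a large C and prints the norm of w; a smaller C yields a smaller ||w||, i.e. a larger margin, because margin violations are penalised less:

```python
# Minimal soft-margin linear SVM trained by full-batch subgradient descent
# (Pegasos-style step-size schedule). Objective: (lam/2)*||w||^2 + mean hinge
# loss, with lam = 1/(C*n), so a large C penalises margin violations heavily.
import math

def train_svm(points, labels, C, iters=2000):
    n = len(points)
    lam = 1.0 / (C * n)               # small C  =>  strong regularisation
    w = [0.0, 0.0]
    for t in range(1, iters + 1):
        lr = 1.0 / (lam * t)          # Pegasos step-size schedule
        gx = gy = 0.0                 # subgradient of the mean hinge loss
        for (x1, x2), y in zip(points, labels):
            if y * (w[0] * x1 + w[1] * x2) < 1.0:   # margin violator
                gx -= y * x1 / n
                gy -= y * x2 / n
        w[0] -= lr * (lam * w[0] + gx)
        w[1] -= lr * (lam * w[1] + gy)
    return w

# Toy 2-D data: two clean clusters plus one mislabelled outlier at (0.5, 0.5).
points = [(2, 1), (3, 2), (3, 0), (4, 1),
          (-2, -1), (-3, -2), (-3, 0), (-4, -1), (0.5, 0.5)]
labels = [1, 1, 1, 1, -1, -1, -1, -1, -1]

for C in (0.01, 10.0):
    w = train_svm(points, labels, C)
    print("C=%-6s ||w||=%.3f" % (C, math.hypot(w[0], w[1])))
```

With the large C, the optimiser pushes the clean points back out to margin 1 despite the outlier, so ||w|| stays larger; with the small C it settles for a wide, error-tolerant margin.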
Trade-off between Maximum Margin and Classification Errors
http://mi.eng.cam.ac.uk/~kkc21/thesis_main/node29.html
The trade-off between the maximum margin and the classification error (during training) is defined by the value C in the QP cost function. The value C is called the Error Penalty. A high error penalty will force the SVM training to avoid classification errors (a separate section of the thesis gives a brief overview of the significance of the value of C).
A larger C will result in a larger search space for the QP optimiser. This generally increases the duration of the QP search, as the results tabulated in the thesis show. Other experiments with larger numbers of data points (1200) fail to converge when C is set higher than 1000. This is mainly due to numerical problems: the cost function of the QP does not decrease monotonically, and a larger search space contributes to these problems.
The number of SVs does not change significantly with different C values. A smaller C does cause the average number of SVs to increase slightly; this could be due to more support vectors being needed to compensate for the bound on the other support vectors. The norm of w decreases with smaller C. This is as expected, because if errors are allowed, the training algorithm can find a separating plane with a much larger margin. The figures in the thesis show the decision boundaries for two very different error penalties on two classifiers (2-to-rest and 5-to-rest). It is clear that with a higher error penalty, the optimiser gives a boundary that classifies all the training points correctly. This can give very irregular boundaries.
One can easily conclude that the more regular boundaries will give better generalisation. This conclusion is also supported by the value of ||w||, which is lower for these two classifiers, i.e. they have a larger margin. One can also use the expected error bound to predict the best error penalty setting. First the expected error bound is computed using the equations referenced in the thesis. The bound predicts that the best settings are C=10 and C=100, and the accuracy obtained on the testing data agrees with this prediction.
So C is generally chosen as 10 or 100.
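In practice, "choosing by experience" usually means a small grid search over candidate C values, scored on a held-out validation split. The sketch below illustrates this with the same kind of toy hinge-loss trainer as above; the data set and candidate grid are made up for illustration, not taken from the thesis or the svm_rank experiment:

```python
# Pick C by scoring each candidate on a held-out validation split.
# Toy hinge-loss trainer (Pegasos-style) and made-up data, for illustration.

def train_svm(points, labels, C, iters=2000):
    n = len(points)
    lam = 1.0 / (C * n)
    w = [0.0, 0.0]
    for t in range(1, iters + 1):
        lr = 1.0 / (lam * t)
        gx = gy = 0.0
        for (x1, x2), y in zip(points, labels):
            if y * (w[0] * x1 + w[1] * x2) < 1.0:
                gx -= y * x1 / n
                gy -= y * x2 / n
        w[0] -= lr * (lam * w[0] + gx)
        w[1] -= lr * (lam * w[1] + gy)
    return w

def accuracy(w, xs, ys):
    return sum(1 for (x1, x2), y in zip(xs, ys)
               if y * (w[0] * x1 + w[1] * x2) > 0) / len(ys)

train_x = [(2, 1), (3, 2), (4, 0), (-2, -1), (-3, -2), (-4, 0), (0.3, 0.6)]
train_y = [1, 1, 1, -1, -1, -1, -1]          # last point is label noise
val_x = [(2.5, 1.5), (3.5, 0.5), (-2.5, -1.5), (-3.5, -0.5)]
val_y = [1, 1, -1, -1]

best_C, best_acc = None, -1.0
for C in (0.1, 1.0, 10.0, 100.0):
    acc = accuracy(train_svm(train_x, train_y, C), val_x, val_y)
    print("C=%-6s validation accuracy=%.2f" % (C, acc))
    if acc > best_acc:
        best_C, best_acc = C, acc
print("selected C =", best_C)
```

On real data the validation scores would differ across C and break ties; cross-validation rather than a single split is the more robust version of the same idea.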
Field test:
When testing data with svm_rank,
the empirical parameter c=1 gave worse results than c=3,
so c=1 was abandoned.
However, training with c=1 took less time than with c=3.
In general, the larger c is, the more iterations svm_rank's learning takes and the longer training lasts.
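The same pattern, including the convergence trouble at very large C noted in the thesis excerpt, can be reproduced on a generic hinge-loss trainer by counting iterations until the update norm falls below a tolerance. This is a sketch on made-up data, not a measurement of svm_rank itself; exact counts depend entirely on the data, tolerance, and optimiser:

```python
# Count subgradient-descent iterations until the update norm drops below
# `tol`, or a hard cap is hit. A larger C typically needs many more
# iterations, and may not converge within the cap at all.

def train_counting(points, labels, C, tol=1e-4, max_iter=10000):
    n = len(points)
    lam = 1.0 / (C * n)
    w = [0.0, 0.0]
    for t in range(1, max_iter + 1):
        lr = 1.0 / (lam * t)
        gx = gy = 0.0
        for (x1, x2), y in zip(points, labels):
            if y * (w[0] * x1 + w[1] * x2) < 1.0:
                gx -= y * x1 / n
                gy -= y * x2 / n
        sx = lr * (lam * w[0] + gx)
        sy = lr * (lam * w[1] + gy)
        w[0] -= sx
        w[1] -= sy
        if (sx * sx + sy * sy) ** 0.5 < tol:
            return w, t                  # converged
    return w, max_iter                   # hit the cap: no convergence

points = [(2, 1), (3, 2), (3, 0), (4, 1),
          (-2, -1), (-3, -2), (-3, 0), (-4, -1), (0.5, 0.5)]
labels = [1, 1, 1, 1, -1, -1, -1, -1, -1]

for C in (0.01, 10.0):
    _, iters = train_counting(points, labels, C)
    print("C=%-6s iterations=%d" % (C, iters))
```

This mirrors the observation above: a heavier error penalty enlarges the effective search space, so the optimiser takes longer to settle, which is why training time grew with c in the svm_rank runs.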