Differences between lasso and ridge regression

This article is a repost. 2018-03-26 04:58

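Lasso, also called L1 regularization, penalizes the absolute values of the coefficients.

Ridge, also called L2 regularization, penalizes the squares of the coefficients.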
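Written out, the two penalized least-squares objectives take the following standard form. The notation (y, X, beta, lambda, n observations, p predictors) is introduced here for illustration and is not from the original post; some texts put a factor of 1/2 or 1/(2n) in front of the residual sum of squares, which only rescales lambda.

```latex
\hat{\beta}^{\text{lasso}}
  = \arg\min_{\beta_0,\,\beta}\;
    \sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^2
    + \lambda \sum_{j=1}^{p} |\beta_j|

\hat{\beta}^{\text{ridge}}
  = \arg\min_{\beta_0,\,\beta}\;
    \sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^2
    + \lambda \sum_{j=1}^{p} \beta_j^2
```

In both cases lambda >= 0 controls the strength of the penalty, and lambda = 0 recovers ordinary least squares.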

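After the ridge penalty, every coefficient shrinks toward zero, but in general none becomes exactly 0.

After the lasso penalty, some coefficients become exactly 0 and the remaining ones shrink.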

LASSO: least absolute shrinkage and selection operator

Because it can set coefficients exactly to 0, the lasso also performs variable selection.
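The difference in shrinkage behavior is easiest to see in a special case. The sketch below assumes an orthonormal design (X'X = I) and the objectives 0.5*||y - Xb||^2 + lam*||b||_1 for lasso and 0.5*||y - Xb||^2 + 0.5*lam*||b||^2 for ridge; under those assumptions each coefficient can be computed from its least-squares value alone. The least-squares values and lam are made up for illustration.

```python
def soft_threshold(z, lam):
    """Lasso solution for one coefficient under an orthonormal design:
    move z toward 0 by lam, and return exactly 0 if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def ridge_shrink(z, lam):
    """Ridge solution for one coefficient under an orthonormal design:
    proportional shrinkage, never exactly 0 unless z is 0."""
    return z / (1.0 + lam)


# Hypothetical least-squares coefficients and penalty strength.
ols = [3.0, -1.5, 0.4, -0.2]
lam = 0.5

lasso = [soft_threshold(b, lam) for b in ols]
ridge = [ridge_shrink(b, lam) for b in ols]

print("lasso:", lasso)  # the two small coefficients are set to 0.0
print("ridge:", ridge)  # every coefficient is scaled by 1 / (1 + lam)
```

With a general (non-orthonormal) design there is no such closed form for the lasso, but the qualitative picture is the same: lasso zeroes out some coefficients, ridge only scales them down.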

===============

What they have in common:

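(1) When the model has an intercept, neither method penalizes it. If the predictors are centered, the intercept estimate is

beta_0 = mean(y)

(2) Both estimators are biased.

(3) Both require the predictors to be standardized (put on a common scale) before the penalty is applied. The penalty sums the coefficient magnitudes, sum |beta_j| or sum beta_j^2, so it treats the coefficients fairly only if the predictors are on comparable scales.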
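Points (1) and (3) can be checked on a toy example. The sketch below uses only the standard library; the data, the helper name `standardize`, and the value of `lam` are hypothetical. It shows that standardizing removes the dependence on measurement units, and that with a centered predictor the unpenalized intercept of a one-predictor ridge fit equals mean(y) whatever the penalty.

```python
from statistics import mean, pstdev


def standardize(column):
    """Center a predictor to mean 0 and scale it to standard deviation 1."""
    m = mean(column)
    s = pstdev(column)
    return [(v - m) / s for v in column]


# The same hypothetical predictor recorded in two different units.
height_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
height_m = [h / 100.0 for h in height_cm]
y = [52.0, 58.0, 67.0, 75.0, 83.0]

z_cm = standardize(height_cm)
z_m = standardize(height_m)
# After standardizing, the two versions coincide, so the penalty no longer
# depends on the unit the predictor happened to be measured in.
print(max(abs(a - b) for a, b in zip(z_cm, z_m)))

# One-predictor ridge with an unpenalized intercept:
# minimize sum (y_i - b0 - b * z_i)^2 + lam * b^2.
lam = 2.0
y_bar = mean(y)
slope = sum(z * (yi - y_bar) for z, yi in zip(z_cm, y)) / (
    sum(z * z for z in z_cm) + lam
)
intercept = y_bar - slope * mean(z_cm)  # mean(z_cm) is 0 up to rounding
print(intercept, y_bar)
```

Without standardization, a predictor recorded in metres would need a coefficient 100 times larger than the same predictor in centimetres and would be penalized far more heavily for the same fit.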

=============

On bias and variance

The bias of the lasso estimate increases as lambda increases.

The variance of the lasso estimate decreases as lambda increases.
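This trade-off can be illustrated with a small Monte Carlo experiment. The sketch below assumes the simplest possible setting, a single coefficient whose least-squares estimate is distributed as N(true_beta, 1), where the lasso estimate is the soft-thresholded least-squares estimate. The true value, the grid of lambda values, and the number of draws are arbitrary choices for illustration.

```python
import random
from statistics import mean, pvariance


def soft_threshold(z, lam):
    """Lasso estimate of a single coefficient given its least-squares estimate z."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


random.seed(0)
true_beta = 2.0
# Simulated least-squares estimates: unbiased, with unit variance.
draws = [random.gauss(true_beta, 1.0) for _ in range(20000)]

results = []
for lam in [0.0, 1.0, 2.0]:
    estimates = [soft_threshold(z, lam) for z in draws]
    bias = mean(estimates) - true_beta
    variance = pvariance(estimates)
    results.append((lam, bias, variance))
    print(f"lambda={lam:.1f}  bias={bias:+.3f}  variance={variance:.3f}")
```

In this setting the magnitude of the bias grows with lambda while the variance falls, which is why lambda is usually chosen by cross-validation to balance the two.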

=============

On prediction error

(1) Some say the two are comparable: In terms of prediction error (or mean squared error), the lasso performs comparably to ridge regression.

(2) Some say ridge is better: “Typically ridge or ℓ2 penalties are **much better** for minimizing prediction error rather than ℓ1 penalties. The reason for this is that when two predictors are highly correlated, ℓ1 regularizer will simply pick one of the two predictors. In contrast, the ℓ2 regularizer will keep both of them and jointly shrink the corresponding coefficients a little bit. Thus, while the ℓ1 penalty can certainly reduce overfitting, you may also experience a loss in predictive power.”
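The coefficient behavior described in the quote can be reproduced on a two-predictor toy problem. The sketch below assumes standardized predictors and works directly with G = X'X/n and c = X'y/n; the correlation 0.95, the values in c, and lam are hand-picked hypothetical numbers, and the helper names are made up. It illustrates only how the coefficients behave, not which method predicts better.

```python
def soft_threshold(z, lam):
    """Shrink z toward 0 by lam; return exactly 0 if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(gram, c, lam, n_iter=100):
    """Coordinate descent for min_b 0.5*b'Gb - c'b + lam*||b||_1."""
    p = len(c)
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of predictor j with the current partial residual.
            partial = c[j] - sum(gram[j][k] * b[k] for k in range(p) if k != j)
            b[j] = soft_threshold(partial, lam) / gram[j][j]
    return b


def ridge_2x2(gram, c, lam):
    """Closed-form ridge for two predictors: solve (G + lam*I) b = c."""
    a11, a12 = gram[0][0] + lam, gram[0][1]
    a21, a22 = gram[1][0], gram[1][1] + lam
    det = a11 * a22 - a12 * a21
    return [(a22 * c[0] - a12 * c[1]) / det, (a11 * c[1] - a21 * c[0]) / det]


rho = 0.95                      # correlation between the two predictors
gram = [[1.0, rho], [rho, 1.0]]
c = [1.0, 0.97]                 # both predictors are related to y
lam = 0.5

lasso = lasso_cd(gram, c, lam)
ridge = ridge_2x2(gram, c, lam)
print("lasso:", lasso)  # second coefficient is exactly 0.0
print("ridge:", ridge)  # both coefficients stay positive
```

Here the lasso keeps only the predictor with the slightly larger correlation with y, while ridge spreads the weight across both.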

===================

Extensions

Bayesian Lasso

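In a Bayesian framework, if the prior on the parameters is a Laplace distribution, maximizing the posterior probability (the MAP estimate) leads to an objective function of the lasso form.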

Bayesian Ridge

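If the prior on the parameters is a normal (Gaussian) distribution, maximizing the posterior probability leads to an objective function of the ridge form.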
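Both statements follow from taking the negative log of the posterior. The derivation below is a sketch under assumptions not stated in the original: a Gaussian likelihood with known noise variance sigma^2, independent priors on the coefficients, a Laplace scale b, and a Gaussian prior variance tau^2.

```latex
y \mid \beta \sim N(X\beta,\ \sigma^2 I)

% Laplace prior: p(\beta_j) = \frac{1}{2b}\, e^{-|\beta_j|/b}
-\log p(\beta \mid y)
  = \frac{1}{2\sigma^2}\,\|y - X\beta\|_2^2
  + \frac{1}{b}\sum_{j=1}^{p} |\beta_j| + \text{const}
\;\Longrightarrow\;
\hat{\beta}^{\text{MAP}}
  = \arg\min_{\beta}\ \|y - X\beta\|_2^2 + \lambda \sum_{j=1}^{p} |\beta_j|,
\qquad \lambda = \frac{2\sigma^2}{b}

% Gaussian prior: \beta_j \sim N(0, \tau^2)
-\log p(\beta \mid y)
  = \frac{1}{2\sigma^2}\,\|y - X\beta\|_2^2
  + \frac{1}{2\tau^2}\sum_{j=1}^{p} \beta_j^2 + \text{const}
\;\Longrightarrow\;
\hat{\beta}^{\text{MAP}}
  = \arg\min_{\beta}\ \|y - X\beta\|_2^2 + \lambda \sum_{j=1}^{p} \beta_j^2,
\qquad \lambda = \frac{\sigma^2}{\tau^2}
```

A tighter prior (smaller b or tau^2) therefore corresponds to a larger lambda, that is, to stronger shrinkage.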

See https://www.zhihu.com/question/23536142

Further reading

Trevor Park & George Casella (2008) The Bayesian Lasso, Journal of the
American Statistical Association, 103:482, 681-686, DOI: 10.1198/016214508000000337

=================

Applications

The lasso, Bayesian lasso, and extensions can be done using the monomvn package in R.
