XGBoost and LightGBM: Parameter Guide and Practical Usage

Reposted 2019-09-25 18:37. Tags: Machine Learning / LGBM / XGBoost

Article link: https://blog.csdn.net/linxid/article/details/80785131
XGBoost
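I. API Details
xgboost.XGBClassifier
1.1 Parameters
1.1.1 General parameters:
booster='gbtree': type of booster to use: gbtree, gblinear, or dart
silent=True: whether to suppress log output during training (newer XGBoost releases replace this with verbosity)
n_jobs=1: number of parallel threads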
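1.1.2 Tree booster parameters
learning_rate=0.1: learning rate (shrinkage applied to each new tree's contribution), analogous to the step size in gradient descent
max_depth=3: maximum depth of each tree
gamma=0: minimum loss reduction required to split a leaf node further
n_estimators=100: number of trees to fit, which can be thought of as the number of boosting rounds
min_child_weight=1: minimum sum of instance weights (Hessian) required in a child node
subsample=1: sampling ratio of training samples (rows)
colsample_bytree=1: sampling ratio of features (columns) for each tree
reg_alpha=0: L1 regularization coefficient
reg_lambda=1: L2 regularization coefficient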
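To make the roles of gamma, min_child_weight, reg_alpha, and reg_lambda more concrete, here is a minimal sketch of the leaf-weight and split-gain formulas from the standard XGBoost formulation. The function names and the gradient sums used below are illustrative assumptions, not part of the xgboost API; G and H stand for the sums of first- and second-order gradients of the samples in a node, and the real library applies these rules with more detail than shown here.

```python
def soft_threshold(g, alpha):
    """Shrink the gradient sum g toward zero by alpha (the effect of L1 regularization)."""
    if g > alpha:
        return g - alpha
    if g < -alpha:
        return g + alpha
    return 0.0


def leaf_weight(g, h, reg_lambda=1.0, reg_alpha=0.0):
    """Optimal leaf weight: -T(G) / (H + lambda), where T is the soft threshold."""
    return -soft_threshold(g, reg_alpha) / (h + reg_lambda)


def leaf_score(g, h, reg_lambda=1.0, reg_alpha=0.0):
    """Structure score of a node: T(G)^2 / (H + lambda)."""
    return soft_threshold(g, reg_alpha) ** 2 / (h + reg_lambda)


def split_gain(gl, hl, gr, hr, reg_lambda=1.0, reg_alpha=0.0, gamma=0.0):
    """Loss reduction of a split, minus the gamma penalty for adding a leaf."""
    left = leaf_score(gl, hl, reg_lambda, reg_alpha)
    right = leaf_score(gr, hr, reg_lambda, reg_alpha)
    parent = leaf_score(gl + gr, hl + hr, reg_lambda, reg_alpha)
    return 0.5 * (left + right - parent) - gamma


def split_allowed(hl, hr, gain, min_child_weight=1.0):
    """A split is kept only if it has positive gain and both children are heavy enough."""
    return gain > 0 and hl >= min_child_weight and hr >= min_child_weight


# Hypothetical gradient statistics for the two children of a candidate split.
gain_no_penalty = split_gain(-4.0, 3.0, 5.0, 4.0, gamma=0.0)
gain_with_gamma = split_gain(-4.0, 3.0, 5.0, 4.0, gamma=5.0)
print(gain_no_penalty, split_allowed(3.0, 4.0, gain_no_penalty))
print(gain_with_gamma, split_allowed(3.0, 4.0, gain_with_gamma))
```

Raising gamma or min_child_weight rejects more candidate splits, while larger reg_alpha and reg_lambda shrink the leaf weights themselves.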
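1.1.3 Learning task parameters
objective='binary:logistic': specifies the learning task and the corresponding objective function
"reg:linear" - linear regression (renamed "reg:squarederror" in later XGBoost releases)
"reg:logistic" - logistic regression
"binary:logistic" - logistic regression for binary classification, outputs probabilities
"binary:logitraw" - logistic regression for binary classification, outputs the raw score before the logistic transformation

"multi:softmax" - multiclass classification with the softmax objective, outputs the predicted class
"multi:softprob" - same as multi:softmax, but outputs the probability of each class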

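random_state=0: random seed
missing=None: value in the data to treat as missing (np.nan when None)
max_delta_step=0: maximum delta step allowed for each tree's weight estimate (0 means no constraint)
colsample_bylevel=1: sampling ratio of features (columns) for each tree level
scale_pos_weight=1: balances the weights of the positive and negative classes
base_score=0.5: initial prediction score for all instances (global bias)
nthread=None: deprecated, use n_jobs instead
seed=None: deprecated, use random_state instead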
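The difference between the raw-score and probability objectives listed above comes down to one transformation. The following is a small sketch in plain Python, not XGBoost code: the helper names and the example margins are made up for illustration.

```python
import math


def sigmoid(margin):
    """Logistic transformation: maps a raw score (binary:logitraw) to a probability (binary:logistic)."""
    return 1.0 / (1.0 + math.exp(-margin))


def softmax(margins):
    """Maps one raw score per class to class probabilities."""
    m = max(margins)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in margins]
    total = sum(exps)
    return [e / total for e in exps]


# Binary case: a raw score of 0 corresponds to a probability of 0.5.
raw_score = 0.0
probability = sigmoid(raw_score)

# Multiclass case with three hypothetical per-class raw scores.
class_margins = [2.0, 1.0, 0.1]
class_probs = softmax(class_margins)                    # the kind of output multi:softprob gives
predicted_class = class_probs.index(max(class_probs))   # the kind of output multi:softmax gives
print(probability, class_probs, predicted_class)
```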

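1.1.4 Controlling overfitting:
Reduce model complexity: max_depth, min_child_weight, and gamma
Randomly sample the training data: subsample (rows), colsample_bytree (columns)
Lower the learning rate and increase the number of boosting rounds accordingly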
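The last point, trading a lower learning rate for more rounds, can be illustrated with a deliberately idealized model. The sketch below assumes that every round's tree fits the current residual exactly and that only a learning_rate fraction of that fit is added, so the remaining residual shrinks by a factor of (1 - learning_rate) per round. Real boosting does not behave this cleanly; the function name and the 1% tolerance are assumptions made for the example.

```python
def rounds_to_reach(learning_rate, tolerance=0.01):
    """Count rounds until the remaining residual fraction drops to `tolerance`.

    Idealized assumption: each round removes a `learning_rate` fraction
    of whatever residual is left.
    """
    residual = 1.0
    rounds = 0
    while residual > tolerance:
        residual *= (1.0 - learning_rate)
        rounds += 1
    return rounds


for lr in (0.3, 0.1, 0.05):
    print(lr, rounds_to_reach(lr))
```

Under this assumption, halving the learning rate from 0.1 to 0.05 roughly doubles the number of rounds needed (44 versus 90), which is why n_estimators is usually raised when learning_rate is lowered.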
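1.2 Methods:
1.2.1 fit
X: feature matrix
y: labels
sample_weight=None: weight of each sample
eval_set=None: list of (X, y) pairs used as validation sets, monitored for early stopping
eval_metric=None: evaluation metric
"rmse": root mean squared error
"mae": mean absolute error
"logloss": negative log-likelihood
"error": binary classification error rate, with a threshold of 0.5
"error@t": like error, but with threshold t
"mlogloss": multiclass logloss
"auc": area under the ROC curve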

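early_stopping_rounds=None: stop training when the validation metric has not improved for this many rounds
verbose=True: whether to print the evaluation metric during training
xgb_model=None: a previously saved model (file name or Booster) to continue training from
sample_weight_eval_set=None: list of per-sample weight arrays, one for each validation set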
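As a rough illustration of what the "error", "error@t", and "logloss" metrics measure, here is a plain-Python sketch. It is not XGBoost's implementation; the function names and the sample labels and probabilities are assumptions chosen for the example.

```python
import math


def error_at(y_true, y_prob, t=0.5):
    """Fraction of misclassified samples when probabilities above t count as positive."""
    wrong = sum(1 for y, p in zip(y_true, y_prob) if (p > t) != (y == 1))
    return wrong / len(y_true)


def logloss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood of the true binary labels."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.35, 0.1]

print(error_at(y_true, y_prob))          # "error": threshold 0.5
print(error_at(y_true, y_prob, t=0.3))   # "error@0.3"
print(logloss(y_true, y_prob))
```

With these numbers the third sample (probability 0.35, label 1) is misclassified at the default threshold but classified correctly at t=0.3, which is the situation "error@t" is meant to capture.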
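1.2.2 predict(data, output_margin=False, ntree_limit=0)
Returns the predicted class labels as an np.array. The decision threshold is fixed, so it cannot easily be adjusted here.

1.2.3 predict_proba(data, ntree_limit=0)
Returns, for each sample, the predicted probability of belonging to each class.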

II. Learning XGBoost from examples
https://github.com/dmlc/xgboost/tree/master/demo

LightGBM
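I. API Details
lightgbm.LGBMClassifier
Model parameters:
boosting_type='gbdt': one of gbdt, dart, goss, rf
num_leaves=31: maximum number of leaves per tree
max_depth=-1: maximum tree depth (-1 means no limit)
n_estimators=100: number of boosting rounds
learning_rate=0.1: learning rate
objective: learning task (regression, binary, or multiclass)
class_weight: weights associated with the classes
subsample=1: sampling ratio of training samples (rows)
colsample_bytree=1: sampling ratio of features (columns)
lambda_l1=0: L1 regularization coefficient (reg_alpha in the scikit-learn wrapper)
lambda_l2=0.0: L2 regularization coefficient (reg_lambda in the scikit-learn wrapper)
random_state=None: random seed
n_jobs=-1: number of threads
max_bin=255: maximum number of bins used to bucket feature values
metric: evaluation metric(s)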
fit parameters:
X, y
eval_set=None: list of validation sets, e.g. [(X_train, y_train), (X_valid, y_valid)]
early_stopping_rounds=None
categorical_feature='auto'
verbose=True
eval_metric=None
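The idea behind early_stopping_rounds can be sketched without any library: training stops once the validation metric has gone a given number of rounds without improving, and the best round seen so far is kept. The function name and the metric history below are assumptions for illustration, and the sketch treats lower scores as better (as with a loss).

```python
def early_stop(scores, early_stopping_rounds):
    """Return (best_iteration, stopped_at) for a lower-is-better metric history."""
    best_score = float("inf")
    best_iteration = -1
    for i, score in enumerate(scores):
        if score < best_score:
            best_score = score
            best_iteration = i
        elif i - best_iteration >= early_stopping_rounds:
            # No improvement for `early_stopping_rounds` rounds: stop here.
            return best_iteration, i
    return best_iteration, len(scores) - 1


# Hypothetical validation losses, one per boosting round.
history = [0.60, 0.50, 0.45, 0.46, 0.47, 0.44, 0.48]
print(early_stop(history, early_stopping_rounds=2))
print(early_stop(history, early_stopping_rounds=3))
```

With a patience of 2 the loop stops at round 4 and never sees the better score at round 5, while a patience of 3 continues long enough to find it. A small value saves time but can stop too early on a noisy metric.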
Metrics:
l1 / mae
l2 / mse (regression)
l2_root / rmse
binary_logloss (binary classification)
auc
multi_logloss

————————————————
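Copyright notice: This is an original article by CSDN blogger "linxid", licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/linxid/article/details/80785131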
