Original post. Please credit the source when reposting: https://www.cnblogs.com/agilestyle/p/12719231.html
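Prepare the data
This example uses the Boston housing dataset bundled with sklearn. The dataset records several indicators that affect house prices, such as the crime rate and the property tax rate, together with the resulting house price. Based on these indicators, we use a CART regression tree to predict Boston house prices.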
    from sklearn.datasets import load_boston
    from sklearn.metrics import mean_squared_error, mean_absolute_error
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeRegressor

    # Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
    # so this import needs an older scikit-learn release.
    boston = load_boston()
    features = boston.data
    labels = boston.target

    # (506, 13)
    features.shape
    # (506,)
    labels.shape
Split into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.33, random_state=0)
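To make the effect of test_size=0.33 concrete, here is a minimal sketch that applies the same call to a synthetic array with the same shape as the Boston data. The names demo_features and demo_labels and the random values are assumptions for illustration only, not the real dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same shape as the Boston data
# (506 samples, 13 features); the values are random, not real prices.
rng = np.random.default_rng(0)
demo_features = rng.normal(size=(506, 13))
demo_labels = rng.normal(size=506)

X_tr, X_te, y_tr, y_te = train_test_split(
    demo_features, demo_labels, test_size=0.33, random_state=0
)

# The test share is rounded up: ceil(506 * 0.33) = 167 test samples,
# which leaves 339 samples for training.
print(X_tr.shape, X_te.shape)  # (339, 13) (167, 13)
```

Because random_state is fixed, repeating the call returns the same partition, which keeps later error figures comparable across runs of the split itself.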
Build and train the model
    dtr = DecisionTreeRegressor()
    # DecisionTreeRegressor(criterion='mse', max_depth=None, max_features=None,
    #                       max_leaf_nodes=None, min_impurity_decrease=0.0,
    #                       min_impurity_split=None, min_samples_leaf=1,
    #                       min_samples_split=2, min_weight_fraction_leaf=0.0,
    #                       presort=False, random_state=None, splitter='best')
    dtr.fit(X_train, y_train)
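The criterion='mse' setting means each split is chosen to reduce the squared error of the target within the resulting child nodes. The following is a simplified sketch of that idea for a single feature. It is not scikit-learn's actual implementation, and the helper names sse and best_split as well as the toy data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) of the best binary split on one feature."""
    pairs = sorted(zip(xs, ys))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical feature values cannot be separated
        # Candidate threshold: midpoint between two neighbouring values.
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Toy data (made up): the target jumps once the feature exceeds 3.
toy_x = [1, 2, 3, 4, 5, 6]
toy_y = [10.0, 11.0, 10.5, 20.0, 21.0, 19.5]
threshold, total_sse = best_split(toy_x, toy_y)
print(threshold)  # 3.5
```

A full CART regression tree repeats this search over every feature and then recurses into each child node until a stopping rule such as max_depth or min_samples_split applies. Each leaf predicts the mean target of its training samples.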
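Evaluate the model
    predict_price = dtr.predict(X_test)
    print('Regression tree mean squared error:', mean_squared_error(y_test, predict_price))
    print('Regression tree mean absolute error:', mean_absolute_error(y_test, predict_price))
Output (results may differ from run to run, because the tree is built without a fixed random_state)
    Regression tree mean squared error: 24.67646706586826
    Regression tree mean absolute error: 3.1670658682634736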
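For reference, the two metrics can be written out by hand. This is a small sketch using made-up numbers (not values from the Boston dataset), and the *_manual helper names are hypothetical. It is only meant to show what the two error figures measure.

```python
def mean_squared_error_manual(y_true, y_pred):
    """Average of the squared differences between truth and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error_manual(y_true, y_pred):
    """Average of the absolute differences between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up prices for illustration; the errors are -1, 1, 4 and -2.
actual = [24.0, 22.0, 35.0, 33.0]
predicted = [25.0, 21.0, 31.0, 35.0]

mse = mean_squared_error_manual(actual, predicted)
mae = mean_absolute_error_manual(actual, predicted)
print(mse)  # 5.5
print(mae)  # 2.0
```

The squared error penalizes the single large miss (4) much more heavily than the absolute error does, which is why the two figures can tell different stories about the same predictions.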
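Visualize the decision tree
    from sklearn.tree import export_graphviz

    # export_graphviz writes the DOT source to the open file handle;
    # its return value is not needed, so it is not assigned.
    with open('boston.dot', 'w') as f:
        export_graphviz(dtr, out_file=f)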
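As a variation on the export step, the sketch below fits a shallow tree on synthetic data and returns the DOT source as a string by passing out_file=None. The data, the feature names f0 to f2, and the max_depth=2 limit are assumptions chosen to keep the output small. They are not part of the original example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_graphviz

# Synthetic data: the target depends mainly on the first feature.
rng = np.random.default_rng(0)
demo_X = rng.uniform(size=(100, 3))
demo_y = 10 * demo_X[:, 0] + rng.normal(scale=0.1, size=100)

# Limiting the depth keeps the rendered tree small enough to read.
small_tree = DecisionTreeRegressor(max_depth=2, random_state=0)
small_tree.fit(demo_X, demo_y)

# With out_file=None, export_graphviz returns the DOT source as a string.
dot_source = export_graphviz(
    small_tree,
    out_file=None,
    feature_names=["f0", "f1", "f2"],
    filled=True,
    rounded=True,
)
print(dot_source[:60])
```

If Graphviz is installed, a saved .dot file can be rendered to an image with a command such as dot -Tpng boston.dot -o boston.png.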
Drawing the regression tree produces the diagram below (the Boston housing dataset has quite a few indicators, so the tree is fairly large):
Reference
https://time.geekbang.org/column/article/78659