Partial Dependence Plots
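A partial dependence plot (PDP) can only be created after the model has been trained (fit).
A PDP shows how a single feature column affects the model's predictions for the target column.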
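To make the idea concrete, here is a minimal pure-Python sketch of how a one-feature partial dependence curve can be computed by hand: force the chosen feature to each grid value for every row, predict, and average. The names partial_dependence_1d and toy_predict are hypothetical and exist only for this illustration; toy_predict is a fixed formula standing in for a fitted model's predict method, not a real trained model.

```python
def partial_dependence_1d(predict, rows, feature_index, grid):
    """Average prediction when one feature is forced to each grid value."""
    averages = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)             # copy so the original data is untouched
            modified[feature_index] = value  # overwrite only the feature of interest
            total += predict(modified)
        averages.append(total / len(rows))
    return averages


# Hypothetical stand-in for a fitted model's predict(): a fixed formula.
def toy_predict(row):
    return 2.0 * row[0] + row[1] ** 2


# Tiny made-up dataset with two feature columns.
rows = [[1.0, 1.0], [2.0, 1.0], [3.0, 2.0]]

# Partial dependence of the prediction on feature 0 at three grid values.
pd_values = partial_dependence_1d(toy_predict, rows, feature_index=0, grid=[0.0, 1.0, 2.0])
print(pd_values)  # [2.0, 4.0, 6.0]
```

The curve rises by 2 for each unit of feature 0, which matches the coefficient in toy_predict; the other feature only contributes a constant offset after averaging.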
Example:
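# Note: this import path comes from older scikit-learn releases; newer releases provide the partial dependence tools in sklearn.inspection.
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.ensemble.partial_dependence import partial_dependence, plot_partial_dependence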
# get_some_data is defined in hidden cell above.
X, y = get_some_data()
# scikit-learn originally implemented partial dependence plots only for Gradient Boosting models
# this was due to an implementation detail, and a future release will support all model types.
my_model = GradientBoostingRegressor()
# fit the model as usual
my_model.fit(X, y)
# Here we make the plot
my_plots = plot_partial_dependence(my_model,
                                   features=[0, 1, 2],  # column numbers of plots we want to show
                                   X=X,  # raw predictors data
                                   feature_names=['Distance', 'Landsize', 'BuildingArea'],  # labels on graphs
                                   grid_resolution=10)  # number of points plotted along the x axis; larger values put more points on the plot (default is 100)
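Tips for using partial dependence plots (PDPs):
Show at most 2-3 variables at a time; with more than that the plots become hard to read.
Do not set grid_resolution too high, or the curves look noticeably jagged; a value of 10, as in the example, works well.
There is also a function called partial_dependence, which returns the raw data behind a PDP; you can pass that data to a plotting package such as Seaborn to draw nicer-looking charts.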
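As an illustration of the last tip, the sketch below pulls the raw PDP numbers out of partial_dependence. It assumes a recent scikit-learn release where the function lives in sklearn.inspection and returns a dictionary-like result; the name of the key holding the grid differs between versions, so the code checks for both. The data is again synthetic.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence

# Synthetic stand-in data (assumption): 3 numeric features, 200 rows.
X, y = make_regression(n_samples=200, n_features=3, random_state=0)
my_model = GradientBoostingRegressor(n_estimators=50, random_state=0).fit(X, y)

# Raw partial dependence of the prediction on feature 0, at 10 grid points.
result = partial_dependence(my_model, X, features=[0], grid_resolution=10, kind="average")

# Newer releases store the grid under "grid_values", older ones under "values".
grid_key = "grid_values" if "grid_values" in result else "values"
grid = result[grid_key][0]        # x-axis values for feature 0
averaged = result["average"][0]   # averaged predictions at each grid value

for x_value, y_value in zip(grid, averaged):
    print(f"{x_value:.3f} -> {y_value:.3f}")
```

The grid and averaged arrays can then be handed to any plotting library, for example as the x and y arguments of seaborn.lineplot.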