1. Error details
Symptom: in the graph rendered by graph.view(), Chinese characters appear as garbled text.
In [40]:
from sklearn import tree
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

# Train a decision tree on the wine dataset.
wine = load_wine()
Xtrain, Xtest, Ytrain, Ytest = train_test_split(wine.data, wine.target, test_size=0.3)
clf = tree.DecisionTreeClassifier(criterion="entropy")
clf = clf.fit(Xtrain, Ytrain)
score = clf.score(Xtest, Ytest)

# The feature names are deliberately left in Chinese: they are the text that fails to render.
feature_name = ['酒精','蘋果酸','灰','灰的鹼性','鎂','總酚','類黃酮','非黃烷類酚類','花青素','顏色強度','色調','od280/od315稀釋葡萄酒','脯氨酸']
In [41]:
import graphviz

dot_data = tree.export_graphviz(
    clf,
    out_file=None,
    feature_names=feature_name,
    class_names=["琴酒", "雪莉", "貝爾摩德"],
    filled=True,
    rounded=True,
)
graph = graphviz.Source(dot_data)
graph.view()
Out[41]:
'Source.gv.pdf'
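2. How the fix works
Handle the DOT source as UTF-8 and replace the default font (helvetica, which the exported DOT source names explicitly) with FangSong, a font that contains Chinese glyphs. FangSong has to be installed on the system for Graphviz to find it.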
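To make the font swap concrete, here is a minimal sketch that applies it to a hand-written DOT fragment. The fragment and the helper name use_cjk_font are illustrative assumptions, not part of the original notebook; the real DOT source produced by export_graphviz is longer but declares its font in the same way.

```python
# Hypothetical DOT fragment, shaped like the header export_graphviz emits by default.
dot_snippet = (
    'digraph Tree {\n'
    'node [shape=box, style="filled, rounded", color="black", fontname="helvetica"] ;\n'
    'edge [fontname="helvetica"] ;\n'
    '0 [label="酒精 <= 12.5"] ;\n'
    '}'
)


def use_cjk_font(dot_source, font="FangSong"):
    """Return the DOT source with the default font name replaced (illustrative helper)."""
    return dot_source.replace("helvetica", font)


patched = use_cjk_font(dot_snippet)
print(patched)
```

Only the fontname attributes change; the Chinese labels are left untouched, so Graphviz can draw them once it uses a font that has the glyphs.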
3. Solutions
(1) Method 1: write the DOT file to disk, then read it back as UTF-8.
In [42]:
import graphviz

# With out_file set to a path, export_graphviz writes the DOT file and returns None.
tree.export_graphviz(
    clf,
    out_file="tree.dot",
    feature_names=feature_name,
    class_names=["琴酒", "雪莉", "貝爾摩德"],
    filled=True,
    rounded=True,
)

# Read the file back as UTF-8 and swap the default font for FangSong.
with open("tree.dot", encoding="utf-8") as f:
    dot_graph = f.read()
graph = graphviz.Source(dot_graph.replace("helvetica", "FangSong"))
graph.view()
Out[42]:
'Source.gv.pdf'
(2) Method 2: patch the DOT source in memory, without writing a DOT file first.
In [43]:
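import graphviz

dot_data = tree.export_graphviz(
    clf,
    out_file=None,
    feature_names=feature_name,
    class_names=["琴酒", "雪莉", "貝爾摩德"],
    filled=True,
    rounded=True,
)

# graphviz.Source takes the DOT source as a str; its encoding argument sets
# the encoding used when the source is saved. This replaces the original
# .encode(encoding='utf-8') call, which hands Source a bytes object.
graph = graphviz.Source(dot_data.replace("helvetica", "FangSong"), encoding="utf-8")
graph.view()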
Out[43]:
'Source.gv.pdf'
Both methods rely on the same principle; choose between them depending on whether you need the DOT file written to disk.
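Because the two methods differ only in whether a DOT file is written, they can be folded into one helper. The sketch below is one possible arrangement, not the notebook's own code: patch_dot, its save_to parameter, and the sample DOT string are all hypothetical names introduced for illustration.

```python
import os
import tempfile
from typing import Optional


def patch_dot(dot_source: str, font: str = "FangSong",
              save_to: Optional[str] = None) -> str:
    """Swap the default font; optionally save the result as a UTF-8 file."""
    patched = dot_source.replace("helvetica", font)
    if save_to is not None:
        # Write explicitly as UTF-8 so the Chinese labels survive the round trip.
        with open(save_to, "w", encoding="utf-8") as f:
            f.write(patched)
    return patched


# Hypothetical one-node DOT source standing in for export_graphviz output.
sample = 'digraph Tree {\nnode [fontname="helvetica"] ;\n0 [label="酒精 <= 12.5"] ;\n}'

# In-memory variant (corresponds to method 2).
in_memory = patch_dot(sample)

# File-based variant (corresponds to method 1): write, then read back as UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tree.dot")
    patch_dot(sample, save_to=path)
    with open(path, encoding="utf-8") as f:
        from_disk = f.read()
```

Either result can then be passed to graphviz.Source(...) as in the cells above. Both variants yield the same DOT text, so the choice comes down to whether the file is wanted.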