Evaluation methods in sklearn.metrics

This article is a repost (2020-02-13 16:15, Python machine learning notes). Original:

https://www.cnblogs.com/mindy-snail/p/12445973.html

1.confusion_matrix

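Evaluating with a confusion matrix
Put simply, a confusion matrix is just a table.
Every correct prediction sits on the diagonal, so the matrix makes it easy to see where the errors are: they are the entries off the diagonal.
Here is a concrete example.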

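This table is a confusion matrix.

The upper table holds the true values and the lower table is the confusion matrix. There should be 2 apples, but only 1 was predicted correctly and the other was classified as banana. There should be 8 bananas, but only 5 were predicted correctly and 3 were classified as pear. There should be 6 pears, but 2 were classified as apple, leaving 4 correct. The diagonal holds the correct predictions; everything off the diagonal is an error.
Code that produces this confusion matrix: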

from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
y_test=["a","b","p","b","b","b","b","p","b","p","b","b","p","p","p","a"]
y_pred=["a","b","p","p","p","p","b","p","b","p","b","b","a","a","p","b"]
confusion_matrix(y_test, y_pred,labels=["a", "b","p"])
# array([[1, 1, 0],
#        [0, 5, 3],
#        [2, 0, 4]], dtype=int64)

print(classification_report(y_test,y_pred))
# Output (format of older scikit-learn releases):
               precision    recall  f1-score   support

          a       0.33      0.50      0.40         2
          b       0.83      0.62      0.71         8
          p       0.57      0.67      0.62         6

avg / total       0.67      0.62      0.64        16
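
The per-class numbers in the report can be recovered from the confusion matrix itself. The following is a minimal sketch, assuming scikit-learn's convention that rows are true labels and columns are predicted labels; the names `cm`, `correct`, `precision` and `recall` are only illustrative.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_test = ["a", "b", "p", "b", "b", "b", "b", "p", "b", "p", "b", "b", "p", "p", "p", "a"]
y_pred = ["a", "b", "p", "p", "p", "p", "b", "p", "b", "p", "b", "b", "a", "a", "p", "b"]
labels = ["a", "b", "p"]

# Rows are true labels, columns are predicted labels
cm = confusion_matrix(y_test, y_pred, labels=labels)

correct = np.diag(cm)                 # correct predictions per class
recall = correct / cm.sum(axis=1)     # divide by row totals (true counts)
precision = correct / cm.sum(axis=0)  # divide by column totals (predicted counts)

for name, p, r in zip(labels, precision, recall):
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

The printed values should agree with the precision and recall columns of the report above.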

I have uploaded this to GitHub.

Reproduction code 1

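# Import necessary modules
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Create training and test set (X and y are assumed to be loaded already)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

# Instantiate a k-NN classifier with 6 neighbors: knn
knn = KNeighborsClassifier(n_neighbors=6)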

# Fit the classifier to the training data
knn.fit(X_train,y_train)

# Predict the labels of the test data: y_pred
y_pred = knn.predict(X_test)

# Generate the confusion matrix and classification report
print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))

Reproduction code 2

Background

Let us start with a binary classification example;
the multi-class case follows the same idea.

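TP (True Positive): positive samples predicted as positive; the true label is 0 and the prediction is 0
FN (False Negative): positive samples predicted as negative; the true label is 0 and the prediction is 1
FP (False Positive): negative samples predicted as positive; the true label is 1 and the prediction is 0
TN (True Negative): negative samples predicted as negative; the true label is 1 and the prediction is 1

Note that label 0 plays the role of the positive class in this example. A predictive classifier should be as accurate as possible. In terms of the confusion matrix, that means we want TP and TN to be large and FP and FN to be small. So once we have the model's confusion matrix, we look at how many observations fall in the cells at the second and fourth quadrants, where larger counts are better; conversely, the fewer observations in the cells at the first and third quadrants, the better.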
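
The four counts can be read straight off `confusion_matrix`. This is a small sketch with made-up labels, assuming (as in the text) that label 0 is the positive class. With scikit-learn's more common convention of 1 as the positive class, the same flattened matrix would read TN, FP, FN, TP instead.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels; following the text, label 0 is the positive class
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

# With labels=[0, 1], row 0 / column 0 belong to the positive class,
# so the flattened matrix reads TP, FN, FP, TN
tp, fn, fp, tn = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(tp, fn, fp, tn)  # 3 1 2 4
```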

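Definitions of several second-level metrics

Accuracy, computed over the whole model
\(\frac{t p+t n}{t p+t n+f p+f n}\)
Precision
\(\frac{t p}{t p+f p}\)
Sensitivity, which is the same as recall: recall = number of relevant items retrieved / number of relevant items in the sample. In plain terms, it is the share of all truly relevant items that were actually retrieved
\(\frac{t p}{t p+f n}\)
Specificity
\(\frac{t n}{t n+f p}\)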
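
As a sketch of how these second-level metrics follow from the four counts, the helper below uses the standard definitions (precision = TP/(TP+FP), recall = TP/(TP+FN), specificity = TN/(TN+FP)). The function name `second_level_metrics` and the counts are hypothetical, chosen only for illustration.

```python
def second_level_metrics(tp, fn, fp, tn):
    """Return accuracy, precision, recall and specificity from the four counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # also called sensitivity
    specificity = tn / (tn + fp)
    return accuracy, precision, recall, specificity

# Hypothetical counts for illustration
acc, prec, rec, spec = second_level_metrics(tp=3, fn=1, fp=2, tn=4)
print(acc, prec, rec, spec)
```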

Third-level metric

\(\mathrm{F} 1\) Score \(=\frac{2 \mathrm{PR}}{\mathrm{P}+\mathrm{R}}\)
where P stands for Precision and R for Recall.
The F1 score combines Precision and Recall into a single number. It ranges from 0 to 1, where 1 is the best possible model output and 0 is the worst.
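
To check the F1 formula against the library, the sketch below computes it by hand from `precision_score` and `recall_score` and compares it with `f1_score`. The labels are made up, and here 1 is the positive class (scikit-learn's default for binary targets).

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical binary labels; 1 is the positive class (scikit-learn's default)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

p = precision_score(y_true, y_pred)   # 3 / (3 + 2)
r = recall_score(y_true, y_pred)      # 3 / (3 + 1)
f1_manual = 2 * p * r / (p + r)
f1_library = f1_score(y_true, y_pred)

print(f1_manual, f1_library)
```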

2.accuracy_score()

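Classification accuracy score

The classification accuracy score is the fraction of all samples that are classified correctly. It is an easy metric to understand, but it tells you nothing about the underlying distribution of the response values, and nothing about the types of error the classifier makes.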
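
A quick way to see this limitation is an imbalanced data set. In the sketch below (made-up data, with 1 as the positive class), a "classifier" that always predicts the majority class scores high accuracy while finding none of the positives.

```python
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced data: 95 negatives (0) and 5 positives (1)
y_true = [0] * 95 + [1] * 5
# A "classifier" that always predicts the majority class
y_pred = [0] * 100

acc = accuracy_score(y_true, y_pred)
rec = recall_score(y_true, y_pred)   # recall for the positive class 1
print(acc, rec)  # 0.95 0.0
```

Accuracy alone hides the fact that every positive sample was missed, which is why it is usually read alongside the confusion matrix.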

sklearn.metrics.accuracy_score(y_true, y_pred, normalize=True, sample_weight=None)
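# normalize: defaults to True, which returns the fraction of correctly classified samples; if False, returns the number of correctly classified samples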

Reproduction code 1

#accuracy_score
import numpy as np
from sklearn.metrics import accuracy_score
y_pred = [1, 9, 9, 5,1,0,2,2]
y_true = [1,9,9,8,0,6,1,2]
print(accuracy_score(y_true, y_pred))
print(accuracy_score(y_true, y_pred, normalize=False))
# 0.5
# 4

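Reproduction code 2

An example from DataCamp

# Import necessary modules
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Assumption: digits is scikit-learn's built-in handwritten digits data set
digits = datasets.load_digits()

# Create feature and target arrays
X = digits.data
y = digits.target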

# Split into training and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state=42, stratify=y)

# Create a k-NN classifier with 7 neighbors: knn
knn = KNeighborsClassifier(n_neighbors=7)

# Fit the classifier to the training data
knn.fit(X_train, y_train)
y_pred=knn.predict(X_test)
# Print the accuracy
print(accuracy_score(y_test, y_pred))

# Output reported in the original post: 0.89996709

ROC

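The ROC curve is the receiver operating characteristic (ROC) curve.
It is a composite indicator of sensitivity and specificity for a continuous variable, and it shows graphically how the two relate to each other.
It is built by applying many different cut-off values to the continuous variable and computing the sensitivity and specificity at each one.
The curve is drawn from a series of different binary splits (cut-off values or decision thresholds), with the true positive rate (TPR, i.e. sensitivity or recall) on the vertical axis and the false positive rate (FPR, i.e. 1 - specificity) on the horizontal axis.
It is best read together with the confusion matrix.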

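Horizontal axis: FPR

\(\mathrm{FPR}=\frac{\mathrm{FP}}{\mathrm{FP}+\mathrm{TN}}\)
Among all samples whose true value is negative, the fraction that the model wrongly classifies as positive.

If these two concepts are not yet familiar, read them over a few more times
😄

Vertical axis: recall

This one is easy to understand: it is about finding the positives again.
Among all samples whose true value is positive, the fraction that the model correctly classifies as positive.
\(\mathrm{TPR}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}\)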
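
The two rates can be computed from a confusion matrix in a few lines. This is a sketch with made-up labels, assuming 1 is the positive class, in which case scikit-learn flattens the 2x2 matrix as TN, FP, FN, TP.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels; 1 is the positive class here
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

# For labels=[0, 1] the flattened matrix reads TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

fpr = fp / (fp + tn)   # share of real negatives flagged as positive
tpr = tp / (tp + fn)   # share of real positives that were found (recall)
print(fpr, tpr)
```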

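Reading the ROC curve

FPR and TPR form the horizontal and vertical axes of the ROC curve, so every point on the curve corresponds to one outcome of the model.
If the ROC curve lies entirely on the vertical axis, then x = 0, i.e. FPR = 0, at every point: the model never misclassifies a negative sample as positive, and the predictions are fully accurate.
Who knows which expert could actually build that ❤️
If the ROC curve lies entirely on the horizontal axis, then y = 0, i.e. TPR = 0: the model never classifies a positive sample as positive, and the predictions are completely wrong.
To be fair, building such a model would also be quite a feat: it is so consistently wrong that you only need to invert its output 😄
If the ROC curve coincides with the 45-degree diagonal, the model does no better than chance: it is as likely to be wrong as right.

So the shape we want is one where TPR is large and FPR is small, which on the plot means a curve that hugs the top-left corner. The 45-degree line is commonly used as the benchmark, i.e. the baseline model: our classifier's ROC curve must beat it, otherwise our predictions are no better than a 50/50 guess.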

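Plotting the ROC curve

The points on an ROC curve are the results obtained for a series of thresholds.
In a classification problem the model does not output negative/positive directly, but a probability of being negative or positive. At what probability should an observation be called negative or positive? That cut-off is the threshold.
Each of the many points on the ROC curve shows how the model performs at one threshold; joining the points gives the ROC curve.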
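
The threshold idea can be made concrete by computing one ROC point by hand. The sketch below uses made-up labels and probabilities (1 is the positive class), and the helper `roc_point` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

# Hypothetical true labels (1 = positive) and predicted probabilities
y_true = np.array([0, 0, 1, 1, 0, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8, 0.6, 0.9])

def roc_point(threshold):
    """Return (FPR, TPR) when predicting positive for probability >= threshold."""
    y_pred = (y_prob >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    return fp / (fp + tn), tp / (tp + fn)

# Each threshold yields one point of the ROC curve
for t in [0.2, 0.5, 0.7]:
    fpr, tpr = roc_point(t)
    print(f"threshold={t}: FPR={fpr:.2f} TPR={tpr:.2f}")
```

Raising the threshold moves the point towards the lower left (fewer false positives, but also fewer true positives); `roc_curve` performs this sweep over the thresholds for you.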

API implementation

sklearn.metrics.roc_curve(y_true,y_score,pos_label=None, sample_weight=None, drop_intermediate=True)

# Import the necessary modules
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.model_selection import train_test_split

# Create training and test sets (X and y are assumed to be loaded already)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

# Create the classifier: logreg
logreg = LogisticRegression()

# Fit the classifier to the training data
logreg.fit(X_train,y_train)

# Predict the labels of the test set: y_pred
y_pred = logreg.predict(X_test)

# Compute and print the confusion matrix and classification report
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))


# Import necessary modules
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

# Compute predicted probabilities: y_pred_prob
y_pred_prob = logreg.predict_proba(X_test)[:,1]

# Generate ROC curve values: fpr, tpr, thresholds
fpr, tpr, thresholds = roc_curve(y_test, y_pred_prob)

# Plot ROC curve
plt.plot([0, 1], [0, 1], 'k--')
plt.plot(fpr, tpr)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.show()

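AUC (Area under the ROC curve)

AUC is the size of the area under the ROC curve. Each ROC curve has one AUC value, which lies between 0 and 1.
AUC = 1: the ROC curve runs up the vertical axis and the model ranks every positive above every negative, so a suitable threshold gives 100% correct predictions.
0.5 < AUC < 1: the ROC curve lies above the 45-degree line and the predictions beat a 50/50 guess. A suitable threshold has to be chosen before the model is put to use.
AUC = 0.5: the ROC curve lies on the 45-degree line and the predictions are equivalent to a 50/50 guess.
0 < AUC < 0.5: the ROC curve lies below the 45-degree line and the predictions are worse than a 50/50 guess.
AUC = 0: the ROC curve runs along the horizontal axis and the predictions are completely wrong.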
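
The three reference values (1, 0.5 and 0) are easy to reproduce with `roc_auc_score`. The scores below are made up for illustration; 1 is the positive class.

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]

perfect = roc_auc_score(y_true, [0.1, 0.2, 0.8, 0.9])   # positives ranked highest
inverted = roc_auc_score(y_true, [0.9, 0.8, 0.2, 0.1])  # positives ranked lowest
constant = roc_auc_score(y_true, [0.5, 0.5, 0.5, 0.5])  # no ranking information

print(perfect, inverted, constant)  # 1.0 0.0 0.5
```

AUC depends only on how the scores rank the samples, which is why inverting the scores of the worst model gives the best one.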

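Implementation

sklearn.metrics.auc(x, y)
# The reorder parameter shown in older documentation was deprecated and later removed from scikit-learn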

# Import necessary modules
from sklearn.model_selection import cross_val_score
from sklearn.metrics import roc_auc_score

# Compute predicted probabilities: y_pred_prob
y_pred_prob = logreg.predict_proba(X_test)[:,1]

# Compute and print AUC score
print("AUC: {}".format(roc_auc_score(y_test, y_pred_prob)))

# Compute cross-validated AUC scores: cv_auc
cv_auc = cross_val_score(logreg, X, y, cv=5, scoring='roc_auc')

# Print list of AUC scores
print("AUC scores computed using 5-fold cross-validation: {}".format(cv_auc))

<script.py> output:
    AUC: 0.8254806777079764
    AUC scores computed using 5-fold cross-validation: [0.80148148 0.8062963  0.81481481 0.86245283 0.8554717 ]

Precision-recall Curve

The precision-recall curve can also be used to judge how good a model is. It is generated by plotting the precision and recall for different thresholds. As a reminder, precision and recall are defined as:
Precision \(=\frac{T P}{T P+F P}\)
Recall \(=\frac{T P}{T P+F N}\)
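
A minimal sketch of obtaining the points of such a curve with `precision_recall_curve`, using made-up labels and probabilities (1 is the positive class). The exact number of points returned can differ between scikit-learn versions, so no output is listed here.

```python
from sklearn.metrics import precision_recall_curve

# Hypothetical true labels (1 = positive) and predicted probabilities
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

precision, recall, thresholds = precision_recall_curve(y_true, y_prob)

# One (precision, recall) pair per threshold, plus a final (1, 0) end point
for p, r in zip(precision, recall):
    print(f"precision={p:.2f} recall={r:.2f}")
```

Plotting `recall` on the horizontal axis against `precision` on the vertical axis gives the curve.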

Hold-out set

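Model evaluation with the hold-out method

Split the data set D directly into two mutually exclusive sets, one used as the training set S and the other as the test set T, i.e. D = S ∪ T and S ∩ T = ∅. After training a model on S, evaluate its test error on T as an estimate of the generalization error.
The train/test split should preserve the data distribution as far as possible, so that the split itself does not introduce extra bias that affects the final result.
Even with the train/test proportions fixed, there are many ways to split the initial data set D, and the choice can affect the evaluation result. A single hold-out run is therefore often not stable or reliable enough; in practice one usually performs several random splits, repeats the experiment, and takes the average as the hold-out result.
In addition, what we want to evaluate is the performance of the model trained on D, but the hold-out method has to split off a test set, which creates a dilemma: if the training set S contains most of the samples, the trained model is closer to the one trained on D, but T is small and the evaluation may be unstable and inaccurate; if the test set T contains more samples, S differs more from D and the evaluated model may differ more from the one trained on D, which lowers the fidelity of the evaluation. A common practice is therefore to use roughly 2/3 to 4/5 of the samples for training and the rest for testing.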
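
The repeated hold-out procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the built-in iris data stands in for D, a 5-nearest-neighbour classifier stands in for the model, and 10 repetitions with a 70/30 split are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the built-in iris data stands in for the data set D
X, y = load_iris(return_X_y=True)

scores = []
for seed in range(10):
    # stratify=y keeps the class proportions the same in S and T
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y)
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X_train, y_train)
    scores.append(knn.score(X_test, y_test))  # accuracy on the held-out set T

# Average over the random splits instead of trusting a single split
print("mean accuracy: {:.3f}, std: {:.3f}".format(np.mean(scores), np.std(scores)))
```

The spread of the individual scores gives a rough idea of how much a single hold-out estimate can vary.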

# Import necessary modules
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Create the hyperparameter grid
c_space = np.logspace(-5, 8, 15)
param_grid = {'C': c_space, 'penalty': ['l1', 'l2']}

# Instantiate the logistic regression classifier: logreg
# (the liblinear solver supports both the l1 and l2 penalties in the grid)
logreg = LogisticRegression(solver='liblinear')

# Create train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.4, random_state=42)

# Instantiate the GridSearchCV object: logreg_cv
logreg_cv = GridSearchCV(logreg, param_grid, cv=5)

# Fit it to the training data
logreg_cv.fit(X_train, y_train)

# Print the optimal parameters and best score
print("Tuned Logistic Regression Parameter: {}".format(logreg_cv.best_params_))
print("Tuned Logistic Regression Accuracy: {}".format(logreg_cv.best_score_))

<script.py> output:
    Tuned Logistic Regression Parameter: {'C': 0.4393970560760795, 'penalty': 'l1'}
    Tuned Logistic Regression Accuracy: 0.7652173913043478

# Import necessary modules
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split

# Create train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.4, random_state=42)

# Create the hyperparameter grid
l1_space = np.linspace(0, 1, 30)
param_grid = {'l1_ratio': l1_space}

# Instantiate the ElasticNet regressor: elastic_net
elastic_net = ElasticNet()

# Setup the GridSearchCV object: gm_cv
gm_cv = GridSearchCV(elastic_net, param_grid, cv=5)

# Fit it to the training data
gm_cv.fit(X_train, y_train)

# Predict on the test set and compute metrics
y_pred = gm_cv.predict(X_test)
r2 = gm_cv.score(X_test, y_test)
mse = mean_squared_error(y_test, y_pred)
print("Tuned ElasticNet l1 ratio: {}".format(gm_cv.best_params_))
print("Tuned ElasticNet R squared: {}".format(r2))
print("Tuned ElasticNet MSE: {}".format(mse))

<script.py> output:
    Tuned ElasticNet l1 ratio: {'l1_ratio': 0.20689655172413793}
    Tuned ElasticNet R squared: 0.8668305372460283
    Tuned ElasticNet MSE: 10.05791413339844

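classification_report()

There are many ways to measure model quality; have a look at the examples in the official documentation and remember the commonly used ones.
Official API documentation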
https://scikit-learn.org/stable/modules/classes.html

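MSE & RMSE

These are the counterparts of variance and standard deviation.
MSE: the mean of the squared errors, \(\mathrm{MSE}=\frac{1}{n} \sum_{i=1}^{n}\left(y_{\text{true}, i}-y_{\text{pred}, i}\right)^{2}\)
RMSE: the square root of the MSE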
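
A short sketch of both quantities, using the usual definition of MSE as the mean of the squared errors and checking it against `mean_squared_error`. The target and prediction values are made up for illustration.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical regression targets and predictions
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

mse_manual = np.mean((y_true - y_pred) ** 2)      # mean of squared errors
mse_library = mean_squared_error(y_true, y_pred)  # same quantity from scikit-learn
rmse = np.sqrt(mse_manual)                        # back in the units of y

print(mse_manual, mse_library, rmse)
```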
