Evaluating Binary Classification Models with AUC

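For a binary classification problem you can build either a classifier or a regressor. For example, the same binary problem can be handled with SVC or with SVR. In Python's sklearn, SVC and SVR are separate classes; in R's e1071 both live in svm, which builds an SVC only when the y variable is a factor and an SVR otherwise.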

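There are many evaluation metrics for binary classification models; only AUC is covered here. In a classification problem, the probability that a randomly chosen positive sample gets a higher predicted score than a randomly chosen negative sample is the C-index (the C statistic of the Mann–Whitney U test), also known as AUC. For the derivation, see "AUC and the Mann–Whitney U test" and wikipedia/Mann_Whitney_U_test. The principles behind AUC are not repeated here; see related material, for example these two reasonable write-ups: "Introduction to ROC and AUC and how to compute AUC" and "Classification performance metrics in machine learning". The following also has some good observations: "Summary of AUC principles and computation methods".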
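The "positive scores higher than negative" interpretation can be checked directly. The sketch below is only an illustration: pairwise_auc is a hypothetical helper name, the labels and scores are made up, and ties are counted as half a win, which is the usual convention.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def pairwise_auc(y_true, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as 0.5. Hypothetical helper, for illustration only.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]  # every positive against every negative
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (len(pos) * len(neg))


# Made-up labels and scores
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

print("pairwise estimate:", pairwise_auc(y_true, scores))
print("roc_auc_score:    ", roc_auc_score(y_true, scores))
```

Here 8 of the 9 positive/negative pairs are ordered correctly, so both numbers come out as 8/9.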

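In fact, the information in a ROC curve can also be shown as a histogram (or density plot): the horizontal axis is the score and the color of the histogram (density) is the true class, for example: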

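If you build a regression model, you can pass the predicted values and the true labels straight to the AUC function; the program sweeps through the predicted values as cutoffs and builds the ROC curve from them.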

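With a classifier, however, predict returns hard labels (0 or 1). These give the ROC curve only a single cutoff, so they are not suitable for computing AUC. You need a "predicted probability" instead, which in sklearn is usually obtained with predict_proba. The same applies to logistic regression in Python: predict also returns 0 or 1, so predict_proba is needed to get the corresponding probabilities. There is also decision_function: when its value is greater than 0, the sample is predicted as positive. R does not have these issues; things in R are simple and easy to use.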
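To make the difference between the three outputs concrete, here is a minimal sketch on synthetic data. The dataset from make_classification and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary data (an assumption, not the author's data)
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(x_train, y_train)

hard_labels = clf.predict(x_test)            # 0/1 labels
proba_pos = clf.predict_proba(x_test)[:, 1]  # probability of class 1
decision = clf.decision_function(x_test)     # > 0 means class 1 is predicted

auc_hard = roc_auc_score(y_test, hard_labels)
auc_proba = roc_auc_score(y_test, proba_pos)
auc_decision = roc_auc_score(y_test, decision)

print("AUC from hard labels:      ", auc_hard)
print("AUC from predict_proba:    ", auc_proba)
print("AUC from decision_function:", auc_decision)
```

For logistic regression the probability is a monotonic function of the decision value, so the last two AUCs agree. The AUC from hard labels only reflects a single operating point and typically differs.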

Example of computing AUC:

test_auc = metrics.roc_auc_score(y_test,y_test_pre)

Example of computing ROC:

fpr, tpr, thresholds = metrics.roc_curve(y_test,y_test_pre)
plt.plot(fpr, tpr, 'b', label='AUC = %0.2f' % test_auc)

Other evaluation metrics for classifiers

Accuracy: can be computed with classifier.score. It is the proportion of all samples that are classified correctly, i.e. (TP+TN)/Total.

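In a binary problem where one class is much smaller than the other, you can get high accuracy simply by assigning every sample to the majority class. So in this situation accuracy alone is not enough.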
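A small made-up example of this effect: with 5 positives and 95 negatives (numbers chosen only for illustration), predicting "negative" for everything looks good on accuracy and finds no positives at all.

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# 5 positives, 95 negatives (made-up class balance)
y_true = np.array([1] * 5 + [0] * 95)

# A "model" that always predicts the majority class
y_pred = np.zeros_like(y_true)

accuracy = accuracy_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)

print("accuracy:", accuracy)  # 95 of 100 samples are labelled correctly
print("recall:  ", recall)    # none of the 5 positives is found
```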

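Precision: denoted P. It is defined with respect to one class. For example, if the goal of the model is to find positives, precision is the number of true positives divided by everything the model flagged as positive, i.e. TP/(TP+FP).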

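Recall: also called "sensitivity" or "true positive rate (TPR)", denoted R. It is also defined with respect to one class. If the goal of the model is to find positives, recall is the number of true positives divided by all actual positives, i.e. TP/(TP+FN). In addition, the false negative rate FNR (miss rate) = FN/(TP+FN), so FNR = 1-R.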

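True negative rate (TNR): also called "specificity", TNR = TN/(TN+FP). False positive rate (FPR): also called the "misdiagnosis rate", FPR = FP/(TN+FP). TNR+FPR = 1. (Remember the horizontal axis of the ROC curve? It is FPR.)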

In short, precision is about being right, and recall is about being complete.
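The definitions above can be computed from a confusion matrix. This is a minimal sketch with made-up labels; the variable names are my own.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels: 4 positives, 6 negatives
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# For 0/1 labels, ravel() yields the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)       # sensitivity, TPR
specificity = tn / (tn + fp)  # TNR
fpr = fp / (tn + fp)          # 1 - specificity
fnr = fn / (tp + fn)          # 1 - recall

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
print(f"specificity={specificity:.3f} FPR={fpr:.3f} FNR={fnr:.3f}")
```

With these labels TP=3, FN=1, FP=1, TN=5, so accuracy is 0.8, precision and recall are both 0.75, and specificity is 5/6.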

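On the relationship between sensitivity and specificity: if sensitivity is the recall of the positive class, then specificity is the recall of the negative class. When positives and negatives matter equally, sensitivity and specificity matter equally too. So for an unbiased classification model (one where all classes are equally important), metrics that combine sensitivity and specificity, such as AUC, are very important. Note that when the model is meant to pick out a special group (for example, a high-risk population), the sensitivity for that group matters more than the specificity.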

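Put another way: when you ask a model whether each item in a pile belongs to some class, precision is the probability that an item really belongs to the class when the model says it does. Recall is the share of true members the model found, so (1 - recall) is the share it missed.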

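F1 score: the harmonic mean of precision and recall, i.e. 2/F1 = 1/precision + 1/recall. Fβ is the more general form, which weights precision and recall; F1 is the special case that treats precision and recall as equally important. Further generalizations include macro-P, macro-R, macro-F1 and micro-P, micro-R, micro-F1.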
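For reference, the usual definition is Fβ = (1+β²)·P·R / (β²·P + R), where β > 1 weights recall more heavily. The sketch below checks a hand-written version against sklearn on made-up labels; f_beta is a hypothetical helper name.

```python
from sklearn.metrics import f1_score, fbeta_score, precision_score, recall_score


def f_beta(precision, recall, beta=1.0):
    """F-beta from precision and recall (hypothetical helper)."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Made-up labels: TP=2, FN=2, FP=1, so P=2/3 and R=1/2
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

p = precision_score(y_true, y_pred)
r = recall_score(y_true, y_pred)

print("F1 by hand:", f_beta(p, r, beta=1), "sklearn:", f1_score(y_true, y_pred))
print("F2 by hand:", f_beta(p, r, beta=2), "sklearn:", fbeta_score(y_true, y_pred, beta=2))
```

Because recall is lower than precision here, F2 (which favors recall) comes out lower than F1.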

Figure (from: "[Machine Learning] Classification performance metrics: ROC curve, AUC, accuracy, recall, sensitivity, specificity")

There is also a figure on Wikipedia:

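Precision and recall affect each other. Ideally both are high, but in general when precision is high recall is low, and when recall is high precision is low. If both are low, something has gone wrong somewhere.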
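The trade-off shows up as soon as you move the decision threshold. A minimal sketch with made-up scores follows; precision_recall_at is a hypothetical helper.

```python
import numpy as np

# Made-up labels and scores, sorted by score for readability
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1, 1, 1])
scores = np.array([0.05, 0.2, 0.3, 0.4, 0.45, 0.6, 0.65, 0.7, 0.85, 0.95])


def precision_recall_at(threshold):
    """Precision and recall when samples with score >= threshold are called positive."""
    pred = scores >= threshold
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    fn = np.sum(~pred & (y_true == 1))
    return tp / (tp + fp), tp / (tp + fn)


for t in (0.25, 0.5, 0.75):
    p, r = precision_recall_at(t)
    print(f"threshold={t:.2f}  precision={p:.3f}  recall={r:.3f}")
```

On this data, raising the threshold from 0.25 to 0.75 takes precision from 0.75 to 1.0 while recall drops from 1.0 to 1/3.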

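If you are doing search, the aim is to raise precision while guaranteeing recall; for disease surveillance or anti-spam, the aim is to raise recall while maintaining precision.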

So when both need to be high, F1 can be used as the measure.

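P/R and ROC are two different evaluation measures with different computations. In general, retrieval uses the former, while classification, recognition and similar tasks use the latter.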

For reference, a decent blog post: "[Machine Learning] Classification performance metrics: ROC curve, AUC, accuracy, recall, sensitivity, specificity"

As a rule of thumb, AUC < 0.6 indicates poor discrimination, 0.6-0.75 moderate discrimination, and > 0.75 good discrimination.

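The AUCs of two models can be tested for a statistically significant difference using a Z test, where the statistic Z approximately follows a normal distribution. In its usual form for two AUCs estimated on independent samples, the formula is:

Z = (AUC1 - AUC2) / sqrt(SE1^2 + SE2^2)

SE1 and SE2 are the standard errors of AUC1 and AUC2, respectively.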
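As a minimal sketch of such a comparison, assume the two AUCs come from independent samples, so that Z = (AUC1 - AUC2) / sqrt(SE1^2 + SE2^2). The function name auc_z_test and the input numbers are hypothetical. For two models evaluated on the same test set the AUCs are correlated and this simple form does not apply directly.

```python
import math


def auc_z_test(auc1, se1, auc2, se2):
    """Z statistic and two-sided p-value for the difference of two AUCs.

    Assumes independent samples (no covariance term). Hypothetical helper.
    """
    z = (auc1 - auc2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided p-value under the standard normal: P(|N(0,1)| > |z|)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Made-up AUCs and standard errors
z, p = auc_z_test(0.85, 0.03, 0.78, 0.04)
print(f"Z = {z:.3f}, two-sided p = {p:.4f}")
```

With these made-up inputs Z is 1.4, which is not significant at the usual 0.05 level.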

Some code snippets

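# AUC/ROC computation for SVR and SVC
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVR,SVC
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.metrics import roc_curve, auc
 
# X, Y: feature matrix and binary (0/1) labels, assumed to be loaded already
x_train, x_test, y_train, y_test = train_test_split(X, Y,  train_size=0.7)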
 
print("------------------------------ SVC ------------------------------------------")
clf = SVC(kernel='rbf', C=100, gamma=0.0001, probability=True)
clf.fit(x_train, y_train)
 
y_train_pre = clf.predict(x_train)
y_test_pre = clf.predict(x_test)
print("Accuracy: "+str(clf.score(x_train,y_train)))  
 
y_train_predict_proba = clf.predict_proba(x_train) # probability of each class
false_positive_rate, recall, thresholds = roc_curve(y_train, y_train_predict_proba[:, 1])
train_auc=auc(false_positive_rate,recall)
print("train AUC: "+str(train_auc))
 
print("------------------------------------")
print("Accuracy: "+str(clf.score(x_test,y_test)))
 
y_test_predict_proba = clf.predict_proba(x_test) # probability of each class
false_positive_rate, recall, thresholds = roc_curve(y_test, y_test_predict_proba[:, 1])
test_auc=auc(false_positive_rate,recall)
print("test AUC: "+str(test_auc))
 
plt.figure(0)
plt.title('ROC of SVM in test data')
plt.plot(false_positive_rate, recall, 'b', label='AUC = %0.2f' % test_auc)
plt.legend(loc='lower right')
plt.plot([0,1],[0,1],'r--')
plt.xlim([0.0,1.0])
plt.ylim([0.0,1.0])
plt.ylabel('Recall')
plt.xlabel('Fall-out')
plt.show()
 
print("--------------------------- SVR ------------------------------------------")
 
reg = SVR(kernel='rbf', C=100, gamma=0.0001)
reg.fit(x_train, y_train)
y_train_pre = reg.predict(x_train)
y_test_pre = reg.predict(x_test)
train_auc = metrics.roc_auc_score(y_train,y_train_pre)
print("train AUC: "+str(train_auc))
 
print("--------------------------------")
 
test_auc = metrics.roc_auc_score(y_test,y_test_pre)
print("test AUC: "+str(test_auc))
fpr, tpr, thresholds = metrics.roc_curve(y_test,y_test_pre)
 
plt.figure(1)
plt.title('ROC of SVR in test data')
plt.plot(fpr, tpr, 'b', label='AUC = %0.2f' % test_auc)
plt.legend(loc='lower right')
plt.plot([0,1],[0,1],'r--')
plt.xlim([0.0,1.0])
plt.ylim([0.0,1.0])
plt.ylabel('Recall')
plt.xlabel('Fall-out')
plt.show()

Logistic regression snippet

# Logistic regression
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve,auc
from sklearn.model_selection import train_test_split
 
# X, y: feature matrix and binary (0/1) labels, assumed to be loaded already
 
# use lowercase x_train/x_test so the names match the calls below
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
# clf = LogisticRegression(random_state=0, solver='lbfgs' ,multi_class='multinomial')
clf = LogisticRegression()
clf.fit(x_train, y_train)
 
# The next few lines only show what these functions do; in real use, do not validate on a handful of arbitrary samples
logi_pre=clf.predict(X[:5, :])
logi_pro=clf.predict_proba(X[:5, :]) 
logi_accuracy=clf.score(x_test, y_test)
logi_deci=clf.decision_function(X[-5:,:])
print(y)
print("prediction of first 5 samples: ",end=" ")
print(logi_pre)
print("prediction probability of first 5 samples: ")
print(logi_pro)
print("decision_function of last 5 samples (positive class is predicted when > 0): ",end=" ")
print(logi_deci)
print("prediction accuracy of test data: ",end=" ")
print(logi_accuracy)
 
predictions=clf.predict_proba(x_test) # probability of each class
false_positive_rate, recall, thresholds = roc_curve(y_test, predictions[:, 1])
roc_auc=auc(false_positive_rate,recall)
plt.title('ROC of logistic regression in test data')
plt.plot(false_positive_rate, recall, 'b', label='AUC = %0.2f' % roc_auc)
plt.legend(loc='lower right')
plt.plot([0,1],[0,1],'r--')
plt.xlim([0.0,1.0])
plt.ylim([0.0,1.0])
plt.ylabel('Recall')
plt.xlabel('Fall-out')
plt.show()

AUC/ROC example from the sklearn website

print(__doc__)
# ROC for model
 
import numpy as np
import matplotlib.pyplot as plt
from itertools import cycle
 
from sklearn import svm, datasets
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize
from sklearn.multiclass import OneVsRestClassifier
from numpy import interp  # scipy.interp was deprecated and later removed from SciPy
 
# Import some data to play with
iris = datasets.load_iris()
X = iris.data
y = iris.target
 
# Binarize the output
y = label_binarize(y, classes=[0, 1, 2])
n_classes = y.shape[1]
 
# Add noisy features to make the problem harder
random_state = np.random.RandomState(0)
n_samples, n_features = X.shape
X = np.c_[X, random_state.randn(n_samples, 200 * n_features)]
 
# shuffle and split training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5,
                                                    random_state=0)
 
# Learn to predict each class against the other
classifier = OneVsRestClassifier(svm.SVC(kernel='linear', probability=True,
                                 random_state=random_state))
y_score = classifier.fit(X_train, y_train).decision_function(X_test)
 
# Compute ROC curve and ROC area for each class
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
 
# Compute micro-average ROC curve and ROC area
fpr["micro"], tpr["micro"], _ = roc_curve(y_test.ravel(), y_score.ravel())
roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])
 
plt.figure()
lw = 2
plt.plot(fpr[2], tpr[2], color='darkorange',
         lw=lw, label='ROC curve (area = %0.2f)' % roc_auc[2])
plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic example')
plt.legend(loc="lower right")
plt.show()

Aside: an example of SVM parameter settings

import numpy as np
from sklearn.svm import SVR,SVC
import matplotlib.pyplot as plt
 
# #############################################################################
# Generate sample data
X = np.sort(16 * np.random.rand(80, 1), axis=0)
y = np.sin(X).ravel()
 
# #############################################################################
# Add noise to targets
y[::5] += 3 * (0.5 - np.random.rand(16))
 
# Fit regression model
# y is continuous here, so this must be SVR; SVC would reject a continuous target
svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1)
# svr_rbf = SVR(kernel='rbf', C=1e3, gamma=100) # may overfit
# svr_lin = SVR(kernel='linear', C=1e3)
# svr_poly = SVR(kernel='poly', C=1e3, degree=2)
y_rbf = svr_rbf.fit(X, y).predict(X)
 
# Look at the results
lw = 2
plt.scatter(X, y, color='darkorange', label='data')
plt.plot(X, y_rbf, color='navy', lw=lw, label='RBF model')
# plt.plot(X, y_lin, color='c', lw=lw, label='Linear model')
# plt.plot(X, y_poly, color='cornflowerblue', lw=lw, label='Polynomial model')
plt.xlabel('data')
plt.ylabel('target')
plt.title('Support Vector Regression')
plt.legend()
plt.show()
