DET Curve (Detection Error Tradeoff Curve)



DET stands for detection error tradeoff. The DET curve serves a similar purpose to the ROC curve, but it is sometimes easier to judge a classifier's performance from a DET curve.

The sklearn documentation introduces it as follows:

DET curves are commonly plotted in normal deviate scale.

To achieve this, plot_det_curve transforms the error rates as returned by det_curve and the axis scale using scipy.stats.norm.

The point of this example is to demonstrate two properties of DET curves, namely:

  1. It might be easier to visually assess the overall performance of different classification algorithms using DET curves over ROC curves. Due to the linear scale used for plotting ROC curves, different classifiers usually only differ in the top left corner of the graph and appear similar for a large part of the plot. On the other hand, because DET curves represent straight lines in normal deviate scale, they tend to be distinguishable as a whole and the area of interest spans a large part of the plot.

  2. DET curves give the user direct feedback of the detection error tradeoff to aid in operating point analysis. The user can deduce directly from the DET-curve plot at which rate the false-negative error rate will improve when willing to accept an increase in false-positive error rate (or vice versa).
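The operating-point reading described in item 2 can be sketched in a few lines. This is only an illustration: fnr_at_fpr is a hypothetical helper, and the error rates below are made-up numbers, not results from any classifier.

```python
def fnr_at_fpr(fpr, fnr, max_fpr):
    """Lowest false-negative rate among points whose FPR does not exceed max_fpr."""
    feasible = [fn for fp, fn in zip(fpr, fnr) if fp <= max_fpr]
    # If no listed point meets the budget, fall back to rejecting every sample,
    # which gives FPR = 0 and FNR = 1.
    return min(feasible) if feasible else 1.0


# Hypothetical operating points along one DET curve (illustrative values only).
fpr = [0.50, 0.20, 0.10, 0.01]
fnr = [0.01, 0.05, 0.10, 0.30]

for budget in (0.01, 0.10, 0.20):
    best = fnr_at_fpr(fpr, fnr, budget)
    print(f"accepting FPR <= {budget:.0%} allows FNR = {best:.0%}")
```

Loosening the false-positive budget can never make the attainable false-negative rate worse, which is the tradeoff the curve displays.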

Figure: comparison of ROC and DET curves

For a more detailed discussion of DET curves, see the paper:

Martin, A., Doddington, G., Kamm, T., Ordowski, M., & Przybocki, M. (1997). The DET curve in assessment of detection task performance. National Inst of Standards and Technology Gaithersburg MD.

Plotting DET curves

(1) sklearn

sklearn provides an interface for plotting DET curves.

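# label_test (true labels) and test_scores (classifier scores) are assumed to exist already.
import matplotlib.pyplot as plt
from sklearn import metrics

fpr_det, fnr_det, thresholds_det = metrics.det_curve(label_test, test_scores, pos_label=1)

# plot DET curve (in normal deviate scale)
display = metrics.DetCurveDisplay(fpr=fpr_det, fnr=fnr_det)
display.plot()
plt.show()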
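The snippet above does not show where label_test and test_scores come from. As a minimal sketch, assuming synthetic Gaussian scores (the distributions, sample size and seed are illustrative choices, not the data used for the figures here), the inputs can be generated and passed to det_curve like this:

```python
import numpy as np
from sklearn import metrics

rng = np.random.default_rng(0)
n = 1000

# Assumed synthetic data: negatives score ~ N(0, 1), positives score ~ N(2, 1).
label_test = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])
test_scores = np.concatenate([rng.normal(0.0, 1.0, n), rng.normal(2.0, 1.0, n)])

fpr_det, fnr_det, thresholds_det = metrics.det_curve(
    label_test, test_scores, pos_label=1
)

# One (FPR, FNR) pair per threshold; print a few of them.
step = max(1, len(thresholds_det) // 5)
for t, fp, fn in zip(thresholds_det[::step], fpr_det[::step], fnr_det[::step]):
    print(f"threshold = {t:+.2f}  FPR = {fp:.3f}  FNR = {fn:.3f}")
```

The arrays are ordered by increasing threshold, so the false-positive rate falls while the false-negative rate rises along them.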

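Figure: DET curve

(2) MATLAB

DET curves are usually plotted on a normal deviate scale, so the data must be rescaled before plotting.

Looking at the implementation of metrics.DetCurveDisplay(fpr=fpr_det, fnr=fnr_det) in sklearn, the key transformation steps are: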

sp.stats.norm.ppf(self.fpr)
sp.stats.norm.ppf(self.fnr)

ticks = [0.001, 0.01, 0.05, 0.20, 0.5, 0.80, 0.95, 0.99, 0.999]
tick_locations = sp.stats.norm.ppf(ticks)

tick_labels = [
    '{:.0%}'.format(s) if (100*s).is_integer() else '{:.1%}'.format(s)
    for s in ticks
]

ax.set_xlim(-3, 3)
ax.set_ylim(-3, 3) 

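Here sp.stats.norm.ppf() is the inverse of the cumulative distribution function (the quantile function, also called the percent point function): given a probability p, it returns the value x at which the CDF equals p.

This is equivalent to norminv(p, mu, sigma) in MATLAB, so a DET curve can be plotted in MATLAB as follows: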
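A quick check of what the quantile function does to the tick values may help. This sketch uses only scipy.stats.norm.ppf and norm.cdf together with the tick list from the excerpt above:

```python
import numpy as np
from scipy.stats import norm

ticks = np.array([0.001, 0.01, 0.05, 0.20, 0.5, 0.80, 0.95, 0.99, 0.999])
tick_locations = norm.ppf(ticks)  # probabilities -> standard normal deviates

for p, z in zip(ticks, tick_locations):
    print(f"p = {p:<5}  ->  z = {z:+.3f}")

# ppf undoes cdf, so the round trip recovers the probabilities.
round_trip = norm.cdf(tick_locations)
print(np.allclose(round_trip, ticks))

# Rates of exactly 0 or 1 map to -inf / +inf and cannot be drawn on this scale.
print(norm.ppf(0.0), norm.ppf(1.0))
```

The 50% tick lands at 0 and the tick positions are symmetric around it. The outermost ticks (0.1% and 99.9%) sit at about ±3.09, just outside the [-3, 3] axis limits set in the excerpt.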

DET_test = load('DET.txt');

fnr = norminv(DET_test(:, 1), 0, 1);  % convert to normal deviate scale
fpr = norminv(DET_test(:, 2), 0, 1);  % convert to normal deviate scale

figure
plot(fnr, fpr, 'linewidth', 2)
xlabel('False negative rate')
ylabel('False positive rate')

% rescale the axes (convert tick positions to normal deviate scale)
ticks = norminv([0.001, 0.01, 0.05, 0.20, 0.5, 0.80, 0.95, 0.99, 0.999]);
ticklabels = {'0.1%', '1%', '5%', '20%', '50%', '80%', '95%', '99%', '99.9%'};
xticks(ticks)
yticks(ticks)
xticklabels(ticklabels)
yticklabels(ticklabels)
xlim([-3, 3]) % [-3sigma, +3sigma]
ylim([-3, 3])

Figure: DET curve

As can be seen, the result matches the one produced by sklearn.
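For comparison, the same manual steps as in the MATLAB script can be sketched in Python with matplotlib. The error rates here are hypothetical stand-ins for DET.txt, computed analytically from assumed score distributions (negatives ~ N(0, 1), positives ~ N(2, 1)); the output file name is also just an example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm

# Hypothetical error rates obtained by sweeping a threshold over assumed
# Gaussian score distributions (these are not the contents of DET.txt).
thresholds = np.linspace(-1.0, 3.0, 200)
fpr = norm.sf(thresholds, loc=0.0, scale=1.0)   # negatives scored above the threshold
fnr = norm.cdf(thresholds, loc=2.0, scale=1.0)  # positives scored at or below it

fig, ax = plt.subplots(figsize=(5, 5))
# Same axis assignment as the MATLAB script: FNR on x, FPR on y.
ax.plot(norm.ppf(fnr), norm.ppf(fpr), linewidth=2)
ax.set_xlabel("False negative rate")
ax.set_ylabel("False positive rate")

# Place ticks at the normal deviates of round percentages.
ticks = [0.001, 0.01, 0.05, 0.20, 0.5, 0.80, 0.95, 0.99, 0.999]
tick_locations = norm.ppf(ticks)
tick_labels = [
    "{:.0%}".format(s) if (100 * s).is_integer() else "{:.1%}".format(s)
    for s in ticks
]
ax.set_xticks(tick_locations)
ax.set_xticklabels(tick_labels)
ax.set_yticks(tick_locations)
ax.set_yticklabels(tick_labels)
ax.set_xlim(-3, 3)  # roughly [-3 sigma, +3 sigma]
ax.set_ylim(-3, 3)

fig.savefig("det_curve_manual.png")
```

Because the assumed scores are Gaussian, the plotted curve is a straight line on these axes.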

 

DET curve on a linear scale:

Figure: DET curve (linear scale)

 https://juliahub.com/docs/ROCAnalysis/GJ3BH/0.3.3/

 https://nbviewer.jupyter.org/github/davidavdav/ROCAnalysis.jl/blob/master/ROCAnalysis.ipynb

 

A Detection Error Trade-off plot (DET plot) shows the same information as the ROC plot above---but the scales are warped according to the inverse of the cumulative normal distribution. This way of plotting has many advantages:

  • If the distributions of target and non-target scores are both Normal, then the DET-curve is a straight line. In practice, many detection problems give rise to more-or-less straight DET curves, and this suggests that there exists a strictly increasing warping function that can make the score distributions (more) Normal.

  • Towards better performance (lower error rates), the resolution of the graph is higher. This makes it easier to have multiple systems / performance characteristics over a smaller or wider range of performance in the same graph, and still be able to tell these apart.

  • Conventionally, the ranges of the axes are chosen 0.1%--50%---and the plot area should really be square. This makes it possible to immediately assess the overall performance based on the absolute position of the line in the graph if you have seen more DET plots in your life.

  • The slope of the (straight) line corresponds to the ratio of the σ parameters of the underlying Normal score distributions, namely that of the non-target scores divided by that of the target scores. Often, highly discriminative classifiers show very flat curves, indicating that target scores have a much larger variance than the non-target scores.

  • The origin of this type of plot lies in psychophysics, where graph paper with lines according to this warping was referred to as double probability paper. The diagonal y = x in a DET plot corresponds linearly to a quantity known as d′ (d-prime) from psychophysics, ranging from 0 at 50% error to about 6 at 0.1% error.
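The straight-line, slope and d′ statements in the list above can be checked numerically. The following is a minimal sketch under stated assumptions: the score distributions are exactly Gaussian with made-up parameters (mu_non, sigma_non, mu_tar, sigma_tar are illustrative names), the false-positive rate is on the x-axis and the false-negative rate on the y-axis, and the d′ relation uses the equal-variance case, where d′ = -2 · ppf(error rate on the diagonal). With this axis orientation the slope is negative and its magnitude is sigma_non / sigma_tar.

```python
import numpy as np
from scipy.stats import norm

# Assumed Gaussian score distributions (illustrative parameters).
mu_non, sigma_non = 0.0, 1.0   # non-target scores
mu_tar, sigma_tar = 3.0, 2.0   # target scores

thresholds = np.linspace(-2.0, 5.0, 50)
fpr = norm.sf(thresholds, loc=mu_non, scale=sigma_non)   # non-targets above threshold
fnr = norm.cdf(thresholds, loc=mu_tar, scale=sigma_tar)  # targets at or below threshold

# Warp both error rates to the normal deviate scale.
x = norm.ppf(fpr)
y = norm.ppf(fnr)

# Fit a line; for Gaussian scores the fit is exact up to rounding.
slope, intercept = np.polyfit(x, y, 1)
residual = np.max(np.abs(y - (slope * x + intercept)))
print(f"slope = {slope:.4f}, expected {-sigma_non / sigma_tar:.4f}")
print(f"max deviation from the line = {residual:.2e}")


def d_prime_at(error_rate):
    """d-prime for equal-variance Gaussians whose diagonal error rate is error_rate."""
    return -2.0 * norm.ppf(error_rate)


print(f"d' at 50% error  = {d_prime_at(0.5):.2f}")
print(f"d' at 0.1% error = {d_prime_at(0.001):.2f}")
```

A larger target variance (sigma_tar) gives a flatter line, which matches the remark about highly discriminative classifiers.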

 

