roc_curve

sklearn.metrics.roc_curve(y_true, y_score, *, pos_label=None, sample_weight=None, drop_intermediate=True)

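Compute the Receiver Operating Characteristic (ROC) curve.

Note: this implementation is restricted to the binary classification task.

Read more in the User Guide.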

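Parameters:

y_true (array-like of shape (n_samples,)): True binary labels. If the labels are not either {-1, 1} or {0, 1}, then pos_label should be given explicitly.
y_score (array-like of shape (n_samples,)): Target scores. These can be probability estimates of the positive class, confidence values, or non-thresholded decision measures (as returned by "decision_function" on some classifiers). For decision_function scores, values greater than or equal to zero should indicate the positive class.
pos_label (int, float, bool or str, default=None): The label of the positive class. When pos_label=None, pos_label is set to 1 if y_true is in {-1, 1} or {0, 1}; otherwise an error is raised.
sample_weight (array-like of shape (n_samples,), default=None): Sample weights.
drop_intermediate (bool, default=True): Whether to drop thresholds whose resulting point is collinear with its neighbors in ROC space. This has no effect on the ROC AUC or the visual shape of the curve, but it reduces the number of plotted points.

Added in version 0.17: parameter drop_intermediate.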
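
As an illustration of drop_intermediate, the sketch below compares the two settings on a small, made-up dataset in which the classes are perfectly separated, so several ROC points lie on the same straight segments. The data and variable names are assumptions chosen for this example; the exact number of points kept may depend on the scikit-learn version.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

# Hypothetical toy data: every positive scores higher than every negative.
y_true = np.array([0, 0, 1, 1, 1, 1])
y_score = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])

# Default behaviour: collinear intermediate points are dropped.
fpr_kept, tpr_kept, thr_kept = roc_curve(y_true, y_score)

# Keep one point per distinct score, plus the extra leading threshold.
fpr_all, tpr_all, thr_all = roc_curve(y_true, y_score, drop_intermediate=False)

print("points with drop_intermediate=True: ", len(thr_kept))
print("points with drop_intermediate=False:", len(thr_all))

# The area under the curve is the same either way.
print("AUC (dropped):", auc(fpr_kept, tpr_kept))
print("AUC (all):    ", auc(fpr_all, tpr_all))
```

Passing drop_intermediate=False is mainly useful when every distinct score is needed as a candidate threshold, for example when searching for an operating point.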

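Returns:

fpr (ndarray of shape (>2,)): Increasing false positive rates such that element i is the false positive rate of predictions with score >= thresholds[i].
tpr (ndarray of shape (>2,)): Increasing true positive rates such that element i is the true positive rate of predictions with score >= thresholds[i].
thresholds (ndarray of shape (n_thresholds,)): Decreasing thresholds on the decision function used to compute fpr and tpr. The first threshold is set to np.inf.

Changed in version 1.3: An arbitrary threshold at infinity (stored in thresholds[0]) is added to represent a classifier that always predicts the negative class, i.e. fpr=0 and tpr=0.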
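
To make the relationship between the three returned arrays concrete, the following sketch recomputes fpr and tpr by hand from the returned thresholds, using the rule "predict positive when score >= thresholds[i]". The toy data and the manual_* names are assumptions for illustration only; this is not how the library computes the curve internally.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical toy data with labels in {0, 1}, so pos_label can stay None.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

fpr, tpr, thresholds = roc_curve(y_true, y_score)

n_pos = np.sum(y_true == 1)
n_neg = np.sum(y_true == 0)

manual_fpr, manual_tpr = [], []
for t in thresholds:
    predicted_pos = y_score >= t  # all False for the leading "infinite" threshold
    manual_tpr.append(np.sum(predicted_pos & (y_true == 1)) / n_pos)
    manual_fpr.append(np.sum(predicted_pos & (y_true == 0)) / n_neg)

manual_fpr = np.array(manual_fpr)
manual_tpr = np.array(manual_tpr)

print(np.allclose(manual_fpr, fpr), np.allclose(manual_tpr, tpr))
```

Because the first threshold lies above every score, no sample is predicted positive there, which is why the curve starts at fpr=0 and tpr=0.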

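See also

RocCurveDisplay.from_estimator: Plot the Receiver Operating Characteristic (ROC) curve given an estimator and some data.
RocCurveDisplay.from_predictions: Plot the Receiver Operating Characteristic (ROC) curve given the true and predicted values.
det_curve: Compute error rates for different probability thresholds.
roc_auc_score: Compute the area under the ROC curve.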

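Notes

Since the thresholds are sorted from low to high values, they are reversed upon return so that they correspond to both fpr and tpr, which are sorted in reversed order during their calculation.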

References

[1] Wikipedia entry for the Receiver operating characteristic.

[2] Fawcett, T. An introduction to ROC analysis. Pattern Recognition Letters, 2006, 27(8):861-874.

Examples

>>> import numpy as np
>>> from sklearn import metrics
>>> y = np.array([1, 1, 2, 2])
>>> scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)
>>> fpr
array([0. , 0. , 0.5, 0.5, 1. ])
>>> tpr
array([0. , 0.5, 0.5, 1. , 1. ])
>>> thresholds
array([ inf, 0.8 , 0.4 , 0.35, 0.1 ])
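
As a possible follow-up to the example above, the sketch below integrates the returned curve with metrics.auc and compares the result with roc_auc_score on the same data. The variable names area_from_curve and area_direct are assumptions for illustration; the labels are converted to 0/1 before calling roc_auc_score so that the positive class is unambiguous.

```python
import numpy as np
from sklearn import metrics

y = np.array([1, 1, 2, 2])
scores = np.array([0.1, 0.4, 0.35, 0.8])

fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)

# Trapezoidal area under the (fpr, tpr) points.
area_from_curve = metrics.auc(fpr, tpr)

# Same quantity computed directly; label 2 is mapped to the positive class 1.
area_direct = metrics.roc_auc_score((y == 2).astype(int), scores)

print(area_from_curve, area_direct)  # both 0.75 for this data
```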
