fowlkes_mallows_score

sklearn.metrics.fowlkes_mallows_score(labels_true, labels_pred, *, sparse='deprecated')

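Measure the similarity of two clusterings of a set of points.

Added in version 0.18.

The Fowlkes-Mallows index (FMI) is defined as the geometric mean of the pairwise precision and recall: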

FMI = TP / sqrt((TP + FP) * (TP + FN))

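Here TP is the number of **true positives** (pairs of points that belong to the same cluster in both labels_true and labels_pred), FP is the number of **false positives** (pairs of points that belong to the same cluster in labels_pred but not in labels_true), and FN is the number of **false negatives** (pairs of points that belong to the same cluster in labels_true but not in labels_pred).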
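The pair counts can be made concrete with a brute-force sketch that enumerates every pair of points. The helper name `pairwise_fmi` is hypothetical and not part of scikit-learn; it is written for clarity and runs in quadratic time, so it is only suitable for small inputs.

```python
from itertools import combinations
from math import sqrt


def pairwise_fmi(labels_true, labels_pred):
    """Fowlkes-Mallows index by explicit pair counting (illustrative helper)."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1  # pair grouped together in both labelings
        elif same_pred:
            fp += 1  # grouped together only in labels_pred
        elif same_true:
            fn += 1  # grouped together only in labels_true
    if tp == 0:
        # No agreeing pair: treat the score as 0 and avoid dividing by zero.
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


# TP = 1, FP = 2, FN = 1, so the score is 1 / sqrt(6), about 0.408.
print(pairwise_fmi([0, 0, 1, 1], [0, 0, 0, 1]))
```

Because precision is TP / (TP + FP) and recall is TP / (TP + FN), the returned value equals the square root of their product.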

The score ranges from 0 to 1. A high value indicates good similarity between the two clusterings.

Read more in the User Guide.

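Parameters:

labels_true (array-like of shape (n_samples,), dtype=int): A clustering of the data into disjoint subsets.
labels_pred (array-like of shape (n_samples,), dtype=int): A clustering of the data into disjoint subsets.
sparse (bool, default=False): Compute the contingency matrix internally with a sparse matrix.

Deprecated since version 1.7: The sparse parameter is deprecated and will be removed in 1.9. It has no effect.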
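The contingency table mentioned for the sparse parameter gives the same score without enumerating pairs. The following is a minimal standard-library sketch of that idea, not scikit-learn's actual implementation; the helper name `contingency_fmi` and the variables `tk`, `pk`, and `qk` are chosen here for illustration.

```python
from collections import Counter
from math import sqrt


def contingency_fmi(labels_true, labels_pred):
    """Fowlkes-Mallows index from contingency counts (illustrative helper)."""
    n = len(labels_true)
    # Contingency cells: how many points carry each (true, predicted) label pair.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)  # cluster sizes in labels_true
    cols = Counter(labels_pred)  # cluster sizes in labels_pred
    # Each sum of squares minus n counts ordered pairs of distinct points:
    # tk = 2 * TP, pk = 2 * (TP + FP), qk = 2 * (TP + FN).
    tk = sum(c * c for c in cells.values()) - n
    pk = sum(c * c for c in cols.values()) - n
    qk = sum(c * c for c in rows.values()) - n
    if tk == 0:
        return 0.0
    return tk / sqrt(pk * qk)


print(contingency_fmi([0, 0, 1, 1], [0, 0, 0, 1]))  # 2 / sqrt(6 * 4), about 0.408
```

The factors of 2 cancel in the ratio, so the result matches TP / sqrt((TP + FP) * (TP + FN)) while needing only one pass over the labels.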

Returns:

score (float): The resulting Fowlkes-Mallows score.

References

[1]

E. B. Fowlkes and C. L. Mallows, 1983. “A method for comparing two hierarchical clusterings”. Journal of the American Statistical Association.

[2]

Wikipedia entry for the Fowlkes-Mallows index

Examples

Perfect labelings are both homogeneous and complete, hence have a score of 1.0:

>>> from sklearn.metrics.cluster import fowlkes_mallows_score
>>> fowlkes_mallows_score([0, 0, 1, 1], [0, 0, 1, 1])
1.0
>>> fowlkes_mallows_score([0, 0, 1, 1], [1, 1, 0, 0])
1.0

If class members are completely split across different clusters, the assignment is totally random, hence the FMI is zero:

>>> fowlkes_mallows_score([0, 0, 0, 0], [0, 1, 2, 3])
0.0
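Between these two extremes, a partially matching labeling yields an intermediate score. The sketch below uses illustrative labels that are not from the original examples and checks the result against pair counts worked out by hand (TP = 1, FP = 2, FN = 1).

```python
from math import sqrt

from sklearn.metrics import fowlkes_mallows_score

labels_true = [0, 0, 1, 1]
labels_pred = [0, 0, 0, 1]  # the third point is merged into the first cluster

score = fowlkes_mallows_score(labels_true, labels_pred)

# Hand-counted pairs: one agreeing pair, two pairs joined only in labels_pred,
# and one pair joined only in labels_true.
expected = 1 / sqrt((1 + 2) * (1 + 1))

print(round(float(score), 3), round(expected, 3))
```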
