fowlkes_mallows_score

sklearn.metrics.fowlkes_mallows_score(labels_true, labels_pred, *, sparse='deprecated')

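Measure the similarity of two clusterings of a set of points.

Added in version 0.18.

The Fowlkes-Mallows index (FMI) is defined as the geometric mean of the precision and the recall: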

FMI = TP / sqrt((TP + FP) * (TP + FN))

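Where TP is the number of True Positives (i.e. the number of pairs of points that belong to the same cluster in both labels_true and labels_pred), FP is the number of False Positives (i.e. the number of pairs of points that belong to the same cluster in labels_pred but not in labels_true), and FN is the number of False Negatives (i.e. the number of pairs of points that belong to the same cluster in labels_true but not in labels_pred).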
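As a rough illustration of the pair counts defined above, the following sketch enumerates every pair of points and tallies TP, FP, and FN directly. The helper name `pairwise_fmi` is hypothetical and not part of scikit-learn. It runs in quadratic time, so it is only meant for checking small inputs by hand, and returning 0.0 when TP is zero is an assumption made here to avoid a division by zero.

```python
from itertools import combinations
from math import sqrt


def pairwise_fmi(labels_true, labels_pred):
    """Brute-force FMI from pair counts (illustrative helper, not scikit-learn's API)."""
    tp = fp = fn = 0
    # Visit every unordered pair of points exactly once.
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1  # together in both clusterings
        elif same_pred:
            fp += 1  # together only in labels_pred
        elif same_true:
            fn += 1  # together only in labels_true
    if tp == 0:
        # Assumption: treat the score as 0.0 when no pair agrees.
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


# TP = 1, FP = 2, FN = 1, so FMI = 1 / sqrt(3 * 2)
print(round(pairwise_fmi([0, 0, 1, 1], [0, 0, 0, 1]), 3))  # 0.408
```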

The score ranges from 0 to 1. A high value indicates strong similarity between the two clusterings.

Read more in the User Guide.

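Parameters:

labels_true : array-like of shape (n_samples,), dtype=int
    A clustering of the data into disjoint subsets.
labels_pred : array-like of shape (n_samples,), dtype=int
    A clustering of the data into disjoint subsets.
sparse : bool, default=False
    Compute the contingency matrix internally with a sparse matrix.

    Deprecated since version 1.7: The sparse parameter is deprecated and will be removed in 1.9. It has no effect.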

Returns:

score : float
    The resulting Fowlkes-Mallows score.

References

[1]

E. B. Fowlkes and C. L. Mallows, 1983. “A method for comparing two hierarchical clusterings”. Journal of the American Statistical Association

[2]

Wikipedia entry for the Fowlkes-Mallows index

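Examples

Perfect labelings are both homogeneous and complete, hence have a score of 1.0. The score is unchanged when the cluster labels are permuted: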

>>> from sklearn.metrics.cluster import fowlkes_mallows_score
>>> fowlkes_mallows_score([0, 0, 1, 1], [0, 0, 1, 1])
1.0
>>> fowlkes_mallows_score([0, 0, 1, 1], [1, 1, 0, 0])
1.0

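If the members of a class are completely split across different clusters, no pair of points that shares a class also shares a cluster (TP = 0), hence the FMI is 0: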

>>> fowlkes_mallows_score([0, 0, 0, 0], [0, 1, 2, 3])
0.0
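The examples above cover only the two extremes. For a partial match, the score can be worked out by hand as the geometric mean of pair-level precision and recall. The sketch below is one possible way to do this in pure Python by counting co-clustered pairs per contingency cell; `fmi_from_contingency` is a hypothetical helper, not scikit-learn's API, and it should not be read as the library's actual implementation.

```python
from collections import Counter
from math import comb, sqrt


def fmi_from_contingency(labels_true, labels_pred):
    """FMI via contingency counts (illustrative helper, not scikit-learn's API)."""
    # Contingency cells: number of points per (true class, predicted cluster).
    cells = Counter(zip(labels_true, labels_pred))
    # Pairs that sit together in both clusterings.
    tp = sum(comb(n, 2) for n in cells.values())
    # Pairs that sit together in labels_pred (TP + FP).
    same_pred = sum(comb(n, 2) for n in Counter(labels_pred).values())
    # Pairs that sit together in labels_true (TP + FN).
    same_true = sum(comb(n, 2) for n in Counter(labels_true).values())
    if tp == 0:
        # Assumption: report 0.0 when no pair agrees.
        return 0.0
    precision = tp / same_pred
    recall = tp / same_true
    return sqrt(precision * recall)


# precision = 1/3 and recall = 1/2, so the score is sqrt(1/6)
score = fmi_from_contingency([0, 0, 1, 1], [0, 0, 0, 1])
print(round(score, 3))  # 0.408
```

Here one cluster in labels_pred absorbs a point from the other class, which lowers precision more than recall and places the score between the two extremes.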
