绘制随机生成的分类数据集#

此示例绘制了几个随机生成的分类数据集。为了便于可视化,所有数据集都有 2 个特征,绘制在 x 和 y 轴上。每个点的颜色代表其类别标签。

前 4 个图使用 make_classification,具有不同数量的信息特征、每个类别的簇和类别。最后 2 个图使用 make_blobsmake_gaussian_quantiles

One informative feature, one cluster per class, Two informative features, one cluster per class, Two informative features, two clusters per class, Multi-class, two informative features, one cluster, Three blobs, Gaussian divided into three quantiles
import matplotlib.pyplot as plt

from sklearn.datasets import make_blobs, make_classification, make_gaussian_quantiles

plt.figure(figsize=(8, 8))
plt.subplots_adjust(bottom=0.05, top=0.9, left=0.05, right=0.95)

plt.subplot(321)
plt.title("One informative feature, one cluster per class", fontsize="small")
X1, Y1 = make_classification(
    n_features=2, n_redundant=0, n_informative=1, n_clusters_per_class=1
)
plt.scatter(X1[:, 0], X1[:, 1], marker="o", c=Y1, s=25, edgecolor="k")

plt.subplot(322)
plt.title("Two informative features, one cluster per class", fontsize="small")
X1, Y1 = make_classification(
    n_features=2, n_redundant=0, n_informative=2, n_clusters_per_class=1
)
plt.scatter(X1[:, 0], X1[:, 1], marker="o", c=Y1, s=25, edgecolor="k")

plt.subplot(323)
plt.title("Two informative features, two clusters per class", fontsize="small")
X2, Y2 = make_classification(n_features=2, n_redundant=0, n_informative=2)
plt.scatter(X2[:, 0], X2[:, 1], marker="o", c=Y2, s=25, edgecolor="k")

plt.subplot(324)
plt.title("Multi-class, two informative features, one cluster", fontsize="small")
X1, Y1 = make_classification(
    n_features=2, n_redundant=0, n_informative=2, n_clusters_per_class=1, n_classes=3
)
plt.scatter(X1[:, 0], X1[:, 1], marker="o", c=Y1, s=25, edgecolor="k")

plt.subplot(325)
plt.title("Three blobs", fontsize="small")
X1, Y1 = make_blobs(n_features=2, centers=3)
plt.scatter(X1[:, 0], X1[:, 1], marker="o", c=Y1, s=25, edgecolor="k")

plt.subplot(326)
plt.title("Gaussian divided into three quantiles", fontsize="small")
X1, Y1 = make_gaussian_quantiles(n_features=2, n_classes=3)
plt.scatter(X1[:, 0], X1[:, 1], marker="o", c=Y1, s=25, edgecolor="k")

plt.show()

**脚本总运行时间:**(0 分钟 0.395 秒)

相关示例

SGD:最大间隔超平面

SGD:最大间隔超平面

二分类 AdaBoost

二分类 AdaBoost

高斯过程分类 (GPC) 的等概率线

高斯过程分类 (GPC) 的等概率线

比较 BIRCH 和 MiniBatchKMeans

比较 BIRCH 和 MiniBatchKMeans

由 Sphinx-Gallery 生成的图库