注意
转到结尾下载完整的示例代码。或通过JupyterLite或Binder在浏览器中运行此示例
将希腊硬币图片分割成区域#
此示例使用基于图像体素间差异创建的图上的谱聚类将图像分解成多个部分同质的区域。
此过程(图像上的谱聚类)是寻找归一化图割的有效近似解。
有三种分配标签的选项:
‘kmeans’ 谱聚类使用 kmeans 算法在嵌入空间中对样本进行聚类。
‘discrete’ 迭代搜索最接近谱聚类嵌入空间的划分空间。
‘cluster_qr’ 使用带旋转的 QR 分解来直接确定嵌入空间中的划分,从而分配标签。
# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause
import time
import matplotlib.pyplot as plt
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.data import coins
from skimage.transform import rescale
from sklearn.cluster import spectral_clustering
from sklearn.feature_extraction import image
# load the coins as a numpy array
orig_coins = coins()
# Resize it to 20% of the original size to speed up the processing
# Applying a Gaussian filter for smoothing prior to down-scaling
# reduces aliasing artifacts.
smoothened_coins = gaussian_filter(orig_coins, sigma=2)
rescaled_coins = rescale(smoothened_coins, 0.2, mode="reflect", anti_aliasing=False)
# Convert the image into a graph with the value of the gradient on the
# edges.
graph = image.img_to_graph(rescaled_coins)
# Take a decreasing function of the gradient: an exponential
# The smaller beta is, the more independent the segmentation is of the
# actual image. For beta=1, the segmentation is close to a voronoi
beta = 10
eps = 1e-6
graph.data = np.exp(-beta * graph.data / graph.data.std()) + eps
# The number of segmented regions to display needs to be chosen manually.
# The current version of 'spectral_clustering' does not support determining
# the number of good quality clusters automatically.
n_regions = 26
计算并可视化生成的区域。
# Computing a few extra eigenvectors may speed up the eigen_solver.
# The spectral clustering quality may also benefit from requesting
# extra regions for segmentation.
n_regions_plus = 3
# Apply spectral clustering using the default eigen_solver='arpack'.
# Any implemented solver can be used: eigen_solver='arpack', 'lobpcg', or 'amg'.
# Choosing eigen_solver='amg' requires an extra package called 'pyamg'.
# The quality of segmentation and the speed of calculations is mostly determined
# by the choice of the solver and the value of the tolerance 'eigen_tol'.
# TODO: varying eigen_tol seems to have no effect for 'lobpcg' and 'amg' #21243.
for assign_labels in ("kmeans", "discretize", "cluster_qr"):
t0 = time.time()
labels = spectral_clustering(
graph,
n_clusters=(n_regions + n_regions_plus),
eigen_tol=1e-7,
assign_labels=assign_labels,
random_state=42,
)
t1 = time.time()
labels = labels.reshape(rescaled_coins.shape)
plt.figure(figsize=(5, 5))
plt.imshow(rescaled_coins, cmap=plt.cm.gray)
plt.xticks(())
plt.yticks(())
title = "Spectral clustering: %s, %.2fs" % (assign_labels, (t1 - t0))
print(title)
plt.title(title)
for l in range(n_regions):
colors = [plt.cm.nipy_spectral((l + 4) / float(n_regions + 4))]
plt.contour(labels == l, colors=colors)
# To view individual segments as appear comment in plt.pause(0.5)
plt.show()
# TODO: After #21194 is merged and #21243 is fixed, check which eigen_solver
# is the best and set eigen_solver='arpack', 'lobpcg', or 'amg' and eigen_tol
# explicitly in this example.
Spectral clustering: kmeans, 2.16s
Spectral clustering: discretize, 1.84s
Spectral clustering: cluster_qr, 1.77s
脚本总运行时间:(0 分钟 6.168 秒)
相关示例
用于图像分割的谱聚类
硬币图像上结构化 Ward 层次聚类的演示
可视化股票市场结构
手写数字数据上 K 均值聚类的演示