SpectralEmbedding

class sklearn.manifold.SpectralEmbedding(n_components=2, *, affinity='nearest_neighbors', gamma=None, random_state=None, eigen_solver=None, eigen_tol='auto', n_neighbors=None, n_jobs=None)

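Spectral embedding for non-linear dimensionality reduction.

Forms an affinity matrix given by the specified function and applies spectral decomposition to the corresponding graph Laplacian. The resulting transformation is given by the value of the eigenvectors for each data point.

Note: the algorithm actually implemented here is Laplacian Eigenmaps.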
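
The steps described above can be sketched with NumPy and SciPy. This is a minimal dense illustration of Laplacian eigenmaps, not the code scikit-learn runs: the function name laplacian_eigenmaps_sketch, the random data X_demo, and the choice of a symmetrized k-nearest-neighbor graph with a generalized eigenproblem are assumptions made for the example.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph


def laplacian_eigenmaps_sketch(X, n_components=2, n_neighbors=10):
    """Illustrative dense Laplacian eigenmaps (hypothetical helper)."""
    # 1. Affinity matrix: symmetrized k-nearest-neighbor connectivity graph.
    knn = kneighbors_graph(X, n_neighbors=n_neighbors, include_self=True)
    W = (0.5 * (knn + knn.T)).toarray()
    # 2. Degree matrix D and unnormalized graph Laplacian L = D - W.
    degrees = W.sum(axis=1)
    D = np.diag(degrees)
    L = D - W
    # 3. Generalized eigenproblem L v = lambda D v, eigenvalues ascending.
    eigenvalues, eigenvectors = eigh(L, D)
    # 4. Drop the first (constant) eigenvector and keep the next n_components.
    return eigenvalues, eigenvectors[:, 1:n_components + 1]


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 5))  # made-up data for illustration
eigenvalues, Y = laplacian_eigenmaps_sketch(X_demo)
```

Each row of Y holds the coordinates of one sample, read off from the selected eigenvectors. The library version works on sparse matrices and uses iterative eigensolvers, so its output can differ in scaling and sign from this sketch.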

Read more in the User Guide.

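Parameters:

n_components : int, default=2

The dimension of the projected subspace.

affinity : {'nearest_neighbors', 'rbf', 'precomputed', 'precomputed_nearest_neighbors'} or callable, default='nearest_neighbors'

How to construct the affinity matrix.

'nearest_neighbors': construct the affinity matrix by computing a graph of nearest neighbors.
'rbf': construct the affinity matrix by computing a radial basis function (RBF) kernel.
'precomputed': interpret X as a precomputed affinity matrix.
'precomputed_nearest_neighbors': interpret X as a sparse graph of precomputed nearest neighbors, and construct the affinity matrix by selecting the n_neighbors nearest neighbors.
callable: use the passed-in function as the affinity. The function takes a data matrix of shape (n_samples, n_features) and returns an affinity matrix of shape (n_samples, n_samples).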
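
As an illustration of the callable option, the sketch below passes a user-defined function as affinity. The helper my_affinity, its fixed gamma=0.5, and the random data X_demo are hypothetical choices for the example; any function that maps an (n_samples, n_features) array to a symmetric (n_samples, n_samples) affinity matrix should fit the description above.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding
from sklearn.metrics.pairwise import rbf_kernel


def my_affinity(X):
    # Hypothetical affinity: an RBF kernel with a hand-picked bandwidth.
    return rbf_kernel(X, gamma=0.5)


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))  # made-up data for illustration

emb = SpectralEmbedding(n_components=2, affinity=my_affinity, random_state=0)
Y = emb.fit_transform(X_demo)

# The matrix returned by the callable is stored on the fitted estimator.
affinity_shape = emb.affinity_matrix_.shape
```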

gamma : float, default=None

Kernel coefficient for the rbf kernel. If None, gamma is set to 1/n_features.
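
The default can be checked against an RBF kernel computed by hand. In this sketch the random data X_demo is an assumption made for illustration; the comparison relies only on the documented fallback of 1/n_features.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 4))  # made-up data, n_features = 4

# gamma=None, so the kernel coefficient should fall back to 1 / n_features.
emb = SpectralEmbedding(affinity="rbf", gamma=None, random_state=0).fit(X_demo)

expected = rbf_kernel(X_demo, gamma=1.0 / X_demo.shape[1])
matches = np.allclose(emb.affinity_matrix_, expected)
```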

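random_state : int, RandomState instance or None, default=None

A pseudo-random number generator used to initialize the lobpcg eigenvector decomposition when eigen_solver == 'amg', and for the K-Means initialization. Use an int to make the results deterministic across calls (see Glossary).

Note

When using eigen_solver == 'amg', it is also necessary to fix the global numpy seed with np.random.seed(int) to get deterministic results. See pyamg/pyamg#139 for further information.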
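
A small reproducibility check, assuming the default solver rather than 'amg' (which needs pyamg and is not exercised here). The data X_demo and the seed value 42 are arbitrary choices for the example.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(80, 5))  # made-up data for illustration

# Two independent estimators built with the same integer seed.
Y1 = SpectralEmbedding(n_components=2, random_state=42).fit_transform(X_demo)
Y2 = SpectralEmbedding(n_components=2, random_state=42).fit_transform(X_demo)

same = np.allclose(Y1, Y2)
```

With eigen_solver='amg', the note above suggests additionally calling np.random.seed(int) before fitting.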

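eigen_solver : {'arpack', 'lobpcg', 'amg'}, default=None

The eigenvalue decomposition strategy to use. AMG requires pyamg to be installed. It can be faster on very large, sparse problems. If None, 'arpack' is used.

eigen_tol : float, default="auto"

Stopping criterion for the eigendecomposition of the Laplacian matrix. If eigen_tol="auto", the tolerance passed on depends on eigen_solver:

If eigen_solver="arpack", then eigen_tol=0.0;
If eigen_solver="lobpcg" or eigen_solver="amg", then eigen_tol=None, which configures the underlying lobpcg solver to determine the value automatically according to its heuristics. See scipy.sparse.linalg.lobpcg for details.

Note that when using eigen_solver="lobpcg" or eigen_solver="amg", values of tol<1e-5 may lead to convergence issues and should be avoided.

Added in version 1.2.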

n_neighbors : int, default=None

Number of nearest neighbors used to build the nearest-neighbors graph. If None, n_neighbors is set to max(n_samples/10, 1).
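
The value that was actually resolved can be read from the fitted estimator. The sketch below assumes 100 random samples (X_demo is made up for illustration), for which the rule max(n_samples/10, 1) gives 10.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 3))  # 100 samples, made-up data

# n_neighbors is left unset, so the default rule applies during fit.
emb = SpectralEmbedding(n_neighbors=None, random_state=0).fit(X_demo)
resolved = emb.n_neighbors_
```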

n_jobs : int, default=None

The number of parallel jobs to run. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.

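Attributes:

embedding_ : ndarray of shape (n_samples, n_components)

Spectral embedding of the training matrix.

affinity_matrix_ : ndarray of shape (n_samples, n_samples)

Affinity matrix constructed from samples or precomputed.

n_features_in_ : int

Number of features seen during fit.

Added in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

Added in version 1.0.

n_neighbors_ : int

Number of nearest neighbors effectively used.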
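
The fitted attributes can be inspected as follows. This is a small sketch on random data (X_demo is an assumption for the example); because the input is a plain NumPy array without column names, feature_names_in_ is not expected to be set.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 4))  # made-up data for illustration

emb = SpectralEmbedding(n_components=3, random_state=0).fit(X_demo)

embedding_shape = emb.embedding_.shape        # (n_samples, n_components)
affinity_shape = emb.affinity_matrix_.shape   # (n_samples, n_samples)
n_features = emb.n_features_in_               # number of input columns
has_names = hasattr(emb, "feature_names_in_")  # only set for named features
```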

See also

Isomap: Non-linear dimensionality reduction through Isometric Mapping.

Examples

>>> from sklearn.datasets import load_digits
>>> from sklearn.manifold import SpectralEmbedding
>>> X, _ = load_digits(return_X_y=True)
>>> X.shape
(1797, 64)
>>> embedding = SpectralEmbedding(n_components=2)
>>> X_transformed = embedding.fit_transform(X[:100])
>>> X_transformed.shape
(100, 2)

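fit(X, y=None)

Fit the model from data in X.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features)

Training vector, where n_samples is the number of samples and n_features is the number of features.

If affinity is "precomputed", X is an {array-like, sparse matrix} of shape (n_samples, n_samples) and is interpreted as a precomputed adjacency graph computed from samples.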
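
A sketch of the precomputed case: the affinity matrix is built outside the estimator and passed to fit in place of the raw samples. The RBF kernel with gamma=0.5 and the random data X_demo are arbitrary choices made for the example.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))  # made-up data for illustration

# Square, symmetric affinity matrix of shape (n_samples, n_samples).
A = rbf_kernel(X_demo, gamma=0.5)

emb = SpectralEmbedding(n_components=2, affinity="precomputed", random_state=0)
emb.fit(A)  # A is interpreted as the affinity matrix, not as features
Y = emb.embedding_
```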

y : Ignored

Not used, present for API consistency by convention.

Returns:

self : object

Returns the instance itself.

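fit_transform(X, y=None)

Fit the model from data in X and transform X.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features)

Training vector, where n_samples is the number of samples and n_features is the number of features.

If affinity is "precomputed", X is an {array-like, sparse matrix} of shape (n_samples, n_samples) and is interpreted as a precomputed adjacency graph computed from samples.

y : Ignored

Not used, present for API consistency by convention.

Returns:

X_new : array-like of shape (n_samples, n_components)

Spectral embedding of the training matrix.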
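
Since the returned array is described the same way as the embedding_ attribute, one quick sanity check is to compare the two. The sketch reuses the digits subset from the example earlier on this page; the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import SpectralEmbedding

X, _ = load_digits(return_X_y=True)

emb = SpectralEmbedding(n_components=2, random_state=0)
X_new = emb.fit_transform(X[:100])

# The return value should match what is stored on the fitted estimator.
same = np.array_equal(X_new, emb.embedding_)
```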

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest

A MetadataRequest encapsulating routing information.

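get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict

Parameter names mapped to their values.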
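
For example, the returned dictionary mirrors the constructor arguments. The particular values below are arbitrary and chosen only for illustration.

```python
from sklearn.manifold import SpectralEmbedding

emb = SpectralEmbedding(n_components=3, affinity="rbf")
params = emb.get_params()

# Keys are constructor parameter names; unset ones keep their defaults.
n_components = params["n_components"]
affinity = params["affinity"]
gamma = params["gamma"]
```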

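set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it is possible to update each component of a nested object.

Parameters:

**params : dict

Estimator parameters.

Returns:

self : estimator instance

Estimator instance.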
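
A short sketch of both forms. The pipeline and its step names "scale" and "embed" are hypothetical and exist only to show the <component>__<parameter> syntax; nothing is fitted here.

```python
from sklearn.manifold import SpectralEmbedding
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Direct use: set_params returns the estimator itself, so calls can be chained.
emb = SpectralEmbedding().set_params(n_components=3, gamma=0.1)

# Nested use: address the embedding step inside a pipeline by its step name.
pipe = Pipeline([("scale", StandardScaler()), ("embed", SpectralEmbedding())])
pipe.set_params(embed__n_components=3, embed__n_neighbors=15)
nested = pipe.named_steps["embed"]
```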

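Gallery examples

Various Agglomerative Clustering on a 2D embedding of digits

Comparison of Manifold Learning methods

Manifold Learning methods on a severed sphere

Manifold learning on handwritten digits: Locally Linear Embedding, Isomap…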