Implementation:Scikit learn Scikit learn LabelPropagation
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Semi-Supervised Learning |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool for semi-supervised classification using label propagation algorithms provided by scikit-learn.
Description
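This module implements two semi-supervised classification algorithms based on label propagation on graphs: LabelPropagation and LabelSpreading. Both build a similarity graph over all data points, labeled and unlabeled, and iteratively propagate label information from labeled to unlabeled points until the label distributions change by less than tol or max_iter is reached. LabelPropagation uses hard clamping: labeled points are reset to their given labels after every iteration, so they never change. LabelSpreading uses soft clamping: at each iteration a point takes a fraction alpha of its label information from its neighbors and keeps a fraction 1 - alpha of its initial label, so the label of a labeled point can change. Both support RBF and KNN kernels. The RBF kernel yields a dense, fully connected graph, while the KNN kernel yields a sparse graph that scales better to large datasets.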
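The difference between hard and soft clamping can be illustrated with a small NumPy sketch. This is a simplified toy version written for illustration, not scikit-learn's implementation: the function name `propagate_labels`, the fixed iteration count (instead of a tolerance check), and the dense RBF graph are all assumptions made here.

```python
import numpy as np


def propagate_labels(X, y, n_classes, alpha=None, gamma=1.0, n_iter=50):
    """Toy label propagation on a dense RBF graph (illustrative only).

    alpha=None     -> hard clamping (LabelPropagation-style)
    0 < alpha < 1  -> soft clamping (LabelSpreading-style)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    labeled = y != -1

    # One-hot initial labels; rows of unlabeled points stay all-zero.
    Y0 = np.zeros((len(y), n_classes))
    Y0[labeled, y[labeled]] = 1.0

    # Dense RBF affinities: W[i, j] = exp(-gamma * ||x_i - x_j||^2).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-gamma * sq_dists)

    if alpha is None:
        # Row-normalized transition matrix.
        T = W / W.sum(axis=1, keepdims=True)
    else:
        # Symmetrically normalized affinities without self-loops.
        np.fill_diagonal(W, 0.0)
        degree = W.sum(axis=1)
        degree[degree == 0] = 1.0  # guard against isolated points
        inv_sqrt = 1.0 / np.sqrt(degree)
        T = W * inv_sqrt[:, None] * inv_sqrt[None, :]

    F = Y0.copy()
    for _ in range(n_iter):
        F = T @ F
        if alpha is None:
            F[labeled] = Y0[labeled]  # hard clamp: restore the given labels
        else:
            F = alpha * F + (1.0 - alpha) * Y0  # soft clamp: blend with initial labels
    return F.argmax(axis=1)


# Two well-separated 1-D clusters with one labeled point each.
X_demo = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
y_demo = [0, -1, -1, 1, -1, -1]
hard = propagate_labels(X_demo, y_demo, n_classes=2)
soft = propagate_labels(X_demo, y_demo, n_classes=2, alpha=0.2)
print(hard, soft)
```

In this sketch the only difference between the two modes is the normalization of the graph and the clamping step inside the loop.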
Usage
Use LabelPropagation or LabelSpreading when you have a small amount of labeled data and a large amount of unlabeled data. LabelSpreading is generally more robust to noisy labels, because soft clamping allows a mislabeled point to be overruled by its neighbors.
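As a rough illustration of the few-labels scenario, the sketch below fits both estimators on a synthetic two-moons dataset where only ten points keep their labels. The dataset, the number of labeled points, and the kernel settings are arbitrary choices made for this example, and the resulting accuracies will vary with them.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.semi_supervised import LabelPropagation, LabelSpreading

X, y_true = make_moons(n_samples=300, noise=0.1, random_state=0)

# Keep labels for five points per class; mark everything else as unlabeled (-1).
rng = np.random.RandomState(0)
labeled_idx = np.concatenate(
    [rng.choice(np.flatnonzero(y_true == c), size=5, replace=False) for c in (0, 1)]
)
y = np.full_like(y_true, -1)
y[labeled_idx] = y_true[labeled_idx]
unlabeled = y == -1

results = {}
for model in (
    LabelPropagation(kernel="knn", n_neighbors=7),
    LabelSpreading(kernel="knn", n_neighbors=7, alpha=0.2),
):
    model.fit(X, y)
    # Transductive accuracy: compare inferred labels of unlabeled points to the truth.
    acc = np.mean(model.transduction_[unlabeled] == y_true[unlabeled])
    results[type(model).__name__] = acc
    print(f"{type(model).__name__}: accuracy on unlabeled points = {acc:.3f}")
```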
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/semi_supervised/_label_propagation.py
Signature
class BaseLabelPropagation(ClassifierMixin, BaseEstimator, metaclass=ABCMeta):
    """Base class for label propagation module."""
    ...

class LabelPropagation(BaseLabelPropagation):
    """Label Propagation classifier."""
    def __init__(
        self,
        kernel="rbf",
        *,
        gamma=20,
        n_neighbors=7,
        max_iter=1000,
        tol=1e-3,
        n_jobs=None,
    ):
        ...

class LabelSpreading(BaseLabelPropagation):
    """LabelSpreading model for semi-supervised learning."""
    def __init__(
        self,
        kernel="rbf",
        *,
        gamma=20,
        n_neighbors=7,
        alpha=0.2,
        max_iter=30,
        tol=1e-3,
        n_jobs=None,
    ):
        ...
Import
from sklearn.semi_supervised import LabelPropagation, LabelSpreading
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| X | array-like of shape (n_samples, n_features) | Yes | Training data (labeled and unlabeled) |
| y | array-like of shape (n_samples,) | Yes | Target labels; -1 indicates unlabeled samples |
| kernel | str or callable | No | Kernel function: 'rbf', 'knn', or a callable that takes two arrays and returns a weight matrix (default: 'rbf') |
| gamma | float | No | Parameter of the RBF kernel (default: 20) |
| n_neighbors | int | No | Number of neighbors for the KNN kernel (default: 7) |
| alpha | float | No | Clamping factor in (0, 1), LabelSpreading only (default: 0.2) |
| max_iter | int | No | Maximum number of iterations (default: 1000 for LabelPropagation, 30 for LabelSpreading) |
| tol | float | No | Convergence tolerance (default: 1e-3) |
| n_jobs | int or None | No | Number of parallel jobs (default: None) |
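The kernel options in the table can be tried side by side. The sketch below fits LabelSpreading with the 'rbf' string kernel, the 'knn' string kernel, and a callable kernel. The callable here is `sklearn.metrics.pairwise.rbf_kernel` with `gamma` bound through `functools.partial`, chosen for this example because it should behave like `kernel='rbf'` with the same `gamma`. The 50% masking rate is an arbitrary assumption.

```python
from functools import partial

import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.semi_supervised import LabelSpreading

X, y_true = load_iris(return_X_y=True)

# Hide roughly half of the labels (arbitrary choice for this example).
rng = np.random.RandomState(0)
y = y_true.copy()
y[rng.rand(len(y)) < 0.5] = -1

configs = {
    "rbf": LabelSpreading(kernel="rbf", gamma=20, alpha=0.2, max_iter=30, tol=1e-3),
    "knn": LabelSpreading(kernel="knn", n_neighbors=7),
    # A callable kernel receives two arrays and returns their weight matrix.
    "callable": LabelSpreading(kernel=partial(rbf_kernel, gamma=20)),
}

transductions = {}
for name, model in configs.items():
    model.fit(X, y)
    transductions[name] = model.transduction_
    print(name, "accuracy on all samples:", np.mean(model.transduction_ == y_true))
```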
Outputs
| Name | Type | Description |
|---|---|---|
| transduction_ | ndarray of shape (n_samples,) | Predicted labels for all samples including unlabeled |
| label_distributions_ | ndarray of shape (n_samples, n_classes) | Label probability distribution for each sample |
| classes_ | ndarray of shape (n_classes,) | Unique class labels |
| n_iter_ | int | Number of iterations run |
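The fitted attributes listed above can be inspected directly after calling fit. The following is a minimal sketch on the iris dataset. Hiding every other label is an arbitrary masking scheme chosen for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelSpreading

X, y_true = load_iris(return_X_y=True)

# Hide every other label (arbitrary masking scheme for this example).
y = y_true.copy()
y[::2] = -1

model = LabelSpreading(kernel="rbf", gamma=20).fit(X, y)

print("classes_:", model.classes_)
print("transduction_ shape:", model.transduction_.shape)
print("label_distributions_ shape:", model.label_distributions_.shape)
print("n_iter_:", model.n_iter_)

# The transduced label is the class with the largest entry in the distribution.
argmax_labels = model.classes_[np.argmax(model.label_distributions_, axis=1)]
```

Note that -1 is a marker for unlabeled samples and does not appear in classes_.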
Usage Examples
Basic Usage
import numpy as np
from sklearn.semi_supervised import LabelPropagation
from sklearn.datasets import load_iris
iris = load_iris()
rng = np.random.RandomState(42)
# Mask some labels as unlabeled (-1)
labels = np.copy(iris.target)
random_unlabeled = rng.rand(len(labels)) < 0.3
labels[random_unlabeled] = -1
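# Fit label propagation on all samples; those labeled -1 are treated as unlabeled
lp = LabelPropagation(kernel='rbf', gamma=20)
lp.fit(iris.data, labels)
print("Accuracy on all:", lp.score(iris.data, iris.target))
# Transductive accuracy on only the samples whose labels were masked
print("Accuracy on unlabeled:", (lp.transduction_[random_unlabeled] == iris.target[random_unlabeled]).mean())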
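Predicting on new data

A fitted model can also label points it did not see during fitting, through predict and predict_proba. The sketch below holds out 20% of the iris samples, masks about half of the remaining labels, and then predicts on the held-out set. The split sizes and masking rate are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.semi_supervised import LabelPropagation

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Mask roughly half of the training labels as unlabeled (-1).
rng = np.random.RandomState(0)
y_partial = y_train.copy()
y_partial[rng.rand(len(y_partial)) < 0.5] = -1

model = LabelPropagation(kernel="rbf", gamma=20).fit(X_train, y_partial)

pred = model.predict(X_test)         # hard class labels for unseen points
proba = model.predict_proba(X_test)  # per-class probabilities for unseen points
print("Held-out accuracy:", np.mean(pred == y_test))
```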