Implementation:Scikit learn Scikit learn DictionaryLearning
| Knowledge Sources | |
|---|---|
| Domains | Dimensionality Reduction, Sparse Coding |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool for dictionary learning and sparse coding provided by scikit-learn.
Description
DictionaryLearning finds a dictionary (a set of atoms) that performs well at sparsely encoding the fitted data. It solves an optimization problem that minimizes half the squared Frobenius norm of the reconstruction error plus an L1 penalty on the codes, weighted by alpha, subject to each dictionary atom having a Euclidean norm of at most 1. The module also provides SparseCoder for encoding data against a fixed dictionary, and MiniBatchDictionaryLearning for scalable online dictionary learning.
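To make the optimization problem concrete, the following minimal sketch fits a small model and evaluates the penalized objective by hand. The synthetic data, the variable names, and the choice of `max_iter=50` are illustrative assumptions, not part of the library documentation.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Synthetic data, used only for illustration (assumption)
rng = np.random.RandomState(0)
X = rng.randn(20, 8)

alpha = 1.0
dl = DictionaryLearning(
    n_components=5,
    alpha=alpha,
    transform_algorithm="lasso_lars",  # transform uses an L1-penalized solver
    max_iter=50,
    random_state=0,
)
dl.fit(X)
codes = dl.transform(X)   # sparse codes, shape (n_samples, n_components)
D = dl.components_        # dictionary, shape (n_components, n_features)

# Objective: 0.5 * ||X - codes @ D||_F^2 + alpha * ||codes||_1
reconstruction = codes @ D
objective = 0.5 * np.sum((X - reconstruction) ** 2) + alpha * np.abs(codes).sum()

# Each atom (row of D) should have Euclidean norm at most 1
atom_norms = np.linalg.norm(D, axis=1)

print("objective:", objective)
print("largest atom norm:", atom_norms.max())
```

Because the all-zero code is always a feasible choice, the objective at the computed codes should not exceed `0.5 * np.sum(X ** 2)`, which gives a quick sanity check on a fitted model.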
Usage
Use DictionaryLearning when you need to learn an overcomplete dictionary from data for sparse representation, feature extraction, or signal decomposition tasks. It is particularly useful for image denoising, texture analysis, and compressed sensing applications where sparse representations are desirable.
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/decomposition/_dict_learning.py
Signature
class DictionaryLearning(_BaseSparseCoding, BaseEstimator):
    def __init__(
        self,
        n_components=None,
        *,
        alpha=1,
        max_iter=1000,
        tol=1e-8,
        fit_algorithm="lars",
        transform_algorithm="omp",
        transform_n_nonzero_coefs=None,
        transform_alpha=None,
        n_jobs=None,
        code_init=None,
        dict_init=None,
        callback=None,
        verbose=False,
        split_sign=False,
        random_state=None,
        positive_code=False,
        positive_dict=False,
        transform_max_iter=1000,
    ):
Import
from sklearn.decomposition import DictionaryLearning
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| n_components | int | No | Number of dictionary elements to extract. Defaults to n_features if None. |
| alpha | float | No | Sparsity controlling parameter (default=1.0). |
| max_iter | int | No | Maximum number of iterations to perform (default=1000). |
| tol | float | No | Tolerance for numerical error (default=1e-8). |
| fit_algorithm | str | No | Algorithm for fitting: 'lars' or 'cd' (default='lars'). |
| transform_algorithm | str | No | Algorithm for transform: 'lasso_lars', 'lasso_cd', 'lars', 'omp', or 'threshold' (default='omp'). |
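| transform_n_nonzero_coefs | int | No | Number of nonzero coefficients to target in each code during transform. Used only by the 'lars' and 'omp' transform algorithms. |
| transform_alpha | float | No | Regularization parameter for transform: the L1 penalty for 'lasso_lars' and 'lasso_cd', or the absolute threshold for 'threshold'. Defaults to alpha. |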
| n_jobs | int | No | Number of parallel jobs to run. |
| random_state | int or RandomState | No | Random state for reproducibility. |
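As a rough illustration of how the transform parameters interact, the sketch below uses the 'omp' transform with `transform_n_nonzero_coefs=2` and counts the nonzero entries in each code. The random data and the specific sizes are assumptions chosen only to keep the example small.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Synthetic data, used only for illustration (assumption)
rng = np.random.RandomState(0)
X = rng.randn(30, 10)

dl = DictionaryLearning(
    n_components=6,
    alpha=1.0,
    max_iter=20,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=2,  # ask OMP for at most two active atoms per sample
    random_state=0,
)
dl.fit(X)
codes = dl.transform(X)

# Count how many atoms each sample actually uses
nonzeros_per_sample = np.count_nonzero(codes, axis=1)
print(codes.shape)
print(nonzeros_per_sample.max())
```

With 'omp', the sparsity level is controlled directly through the coefficient count, whereas the 'lasso_lars' and 'lasso_cd' algorithms control it indirectly through `transform_alpha`.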
Outputs
| Name | Type | Description |
|---|---|---|
| components_ | ndarray of shape (n_components, n_features) | The dictionary atoms extracted from the data. |
| error_ | array | Vector of errors at each iteration. |
| n_features_in_ | int | Number of features seen during fit. |
| n_iter_ | int | Number of iterations run. |
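The fitted attributes listed above can be inspected directly after calling `fit`. This is a minimal sketch on assumed synthetic data; the sizes and the `max_iter` value are arbitrary.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Synthetic data, used only for illustration (assumption)
rng = np.random.RandomState(0)
X = rng.randn(25, 6)

max_iter = 30
dl = DictionaryLearning(n_components=4, max_iter=max_iter, random_state=0)
dl.fit(X)

print(dl.components_.shape)  # one row per atom, one column per feature
print(dl.n_features_in_)     # number of columns of X seen during fit
print(dl.n_iter_)            # iterations actually run, at most max_iter
print(len(dl.error_))        # one recorded error value per iteration
```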
Usage Examples
Basic Usage
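import numpy as np
from sklearn.decomposition import DictionaryLearning

# Three samples with four features each
X = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]], dtype=float)

# Learn a two-atom dictionary; transform() will use LassoLars to compute codes
dl = DictionaryLearning(n_components=2, transform_algorithm='lasso_lars', random_state=0)

# fit_transform learns the dictionary and returns the sparse codes for X
X_transformed = dl.fit_transform(X)
print(X_transformed.shape)  # (3, 2): one code per sample, one coefficient per atom
print(dl.components_.shape)  # (2, 4): one atom per row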
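Encoding New Data with a Learned Dictionary

The description mentions SparseCoder for encoding against a fixed dictionary. One possible workflow, sketched here on assumed synthetic data, is to learn the atoms once with DictionaryLearning and then reuse them to encode and approximately reconstruct unseen samples. The variable names and sizes are hypothetical.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning, SparseCoder

# Synthetic training and held-out data, used only for illustration (assumption)
rng = np.random.RandomState(0)
X_train = rng.randn(40, 8)
X_new = rng.randn(5, 8)

# Learn the dictionary on the training data
dl = DictionaryLearning(
    n_components=6,
    alpha=1.0,
    max_iter=20,
    transform_algorithm="lasso_lars",
    random_state=0,
)
dl.fit(X_train)

# Reuse the learned atoms as a fixed dictionary; SparseCoder needs no fitting
coder = SparseCoder(
    dictionary=dl.components_,
    transform_algorithm="lasso_lars",
    transform_alpha=1.0,
)
codes_new = coder.transform(X_new)

# Approximate reconstruction: codes times dictionary
X_new_approx = codes_new @ dl.components_
rel_error = np.linalg.norm(X_new - X_new_approx) / np.linalg.norm(X_new)
print(codes_new.shape)
print("relative reconstruction error:", rel_error)
```

With matching algorithm and regularization settings, `coder.transform(X_new)` is expected to agree with `dl.transform(X_new)`; the separate SparseCoder object is mainly convenient when the dictionary comes from elsewhere or must be shipped without the training estimator.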