Implementation:Scikit learn Scikit learn VarianceThreshold
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Feature Selection |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool for removing low-variance features from datasets, provided by scikit-learn.
Description
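The VarianceThreshold class is a feature selector that removes every feature whose training-set variance is not strictly greater than a specified threshold. With the default threshold of 0.0, only features that take the same value in every sample are removed. It looks only at the features (X) and never at the targets (y), so it can be used for unsupervised feature selection. It accepts both dense and sparse input matrices and tolerates NaN values, which are ignored when the variances are computed.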
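As a rough illustration of the rule described above, the sketch below fits the selector on a small array that contains a NaN and compares the fitted variances with NumPy's NaN-aware population variance. The toy data, the threshold of 0.5, and the variable names are assumptions made for this example only.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Illustrative data: column 0 is constant, column 1 contains a NaN,
# column 2 varies across all samples.
X = np.array([
    [1.0, 0.0, 3.0],
    [1.0, np.nan, 1.0],
    [1.0, 2.0, 2.0],
    [1.0, 4.0, 0.0],
])

# A non-zero threshold chosen only for this sketch.
selector = VarianceThreshold(threshold=0.5).fit(X)

# Population variance (ddof=0) per column, skipping NaN entries.
manual_variances = np.nanvar(X, axis=0)

# A feature is kept only if its variance is strictly greater than the threshold.
manual_mask = manual_variances > 0.5

print(selector.variances_)
print(selector.get_support())
```

Here the constant column is dropped, while the column with the missing entry is judged on its three observed values.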
Usage
Use this selector as a simple baseline feature selection step to remove constant or near-constant features before applying more sophisticated feature selection or modeling techniques.
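One way to apply this baseline step is to place the selector at the front of a pipeline. The following is a minimal sketch rather than a recommended setup: the synthetic data, the constant column, and the choice of LogisticRegression as the downstream model are all assumptions made for illustration.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Synthetic data (assumed for this sketch): four features, one of them constant.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
X[:, 2] = 1.0  # constant column that carries no information
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Drop zero-variance features first, then fit the downstream model.
pipe = make_pipeline(VarianceThreshold(), LogisticRegression())
pipe.fit(X, y)

# make_pipeline names each step after its lowercased class name.
n_kept = int(pipe.named_steps["variancethreshold"].get_support().sum())
print(n_kept)
```

Because the selector is fitted inside the pipeline, the same columns are dropped at prediction time without extra bookkeeping.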
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/feature_selection/_variance_threshold.py
Signature
class VarianceThreshold(SelectorMixin, BaseEstimator):
    def __init__(self, threshold=0.0):
Import
from sklearn.feature_selection import VarianceThreshold
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
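| threshold | float | No | Variance threshold; features whose training-set variance is not strictly greater than this value are removed (default 0.0, which removes only constant features) |
| X | array-like or sparse matrix of shape (n_samples, n_features) | Yes | Training data from which the per-feature variances are computed |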
Outputs
| Name | Type | Description |
|---|---|---|
| variances_ | ndarray of shape (n_features,) | Variances of individual features |
| n_features_in_ | int | Number of features seen during fit |
| X_transformed | ndarray or sparse matrix of shape (n_samples, n_selected_features) | Reduced feature matrix returned by transform or fit_transform, with low-variance features removed |
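The outputs listed above can be inspected after fitting. The sketch below uses a small sparse matrix to show the fitted attributes and the type of the transformed result; the data and variable names are assumptions made for this example.

```python
import numpy as np
from scipy import sparse
from sklearn.feature_selection import VarianceThreshold

# Illustrative data: the first and last columns are constant.
X_dense = np.array([
    [0, 2, 0, 3],
    [0, 1, 4, 3],
    [0, 1, 1, 3],
], dtype=float)
X_sparse = sparse.csr_matrix(X_dense)

sel = VarianceThreshold().fit(X_sparse)  # default threshold=0.0
X_out = sel.transform(X_sparse)

print(sel.n_features_in_)      # number of columns seen during fit
print(sel.variances_.shape)    # one variance per input feature
print(sparse.issparse(X_out))  # sparse input yields sparse output
print(X_out.shape)
```

Only the two non-constant columns survive, so the transformed matrix has two features.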
Usage Examples
Basic Usage
from sklearn.feature_selection import VarianceThreshold
X = [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]]
sel = VarianceThreshold(threshold=0.16)
X_selected = sel.fit_transform(X)
print(X_selected.shape)  # (6, 2): only the two features with variance > 0.16 are kept
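The result of the basic example can be checked by hand. For a 0/1 feature in which a fraction p of the samples equal 1, the variance is p(1 - p), so a threshold of 0.16 = 0.8 * (1 - 0.8) removes features that hold the same value in more than 80% of the samples. The sketch below recomputes the variances that way and compares them with the fitted selector; the variable names are illustrative.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

X = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]])

# Fraction of ones per column, then the Bernoulli variance p * (1 - p).
p = X.mean(axis=0)
bernoulli_var = p * (1 - p)  # 5/36, 2/9 and 1/4 for the three columns

sel = VarianceThreshold(threshold=0.16).fit(X)

print(bernoulli_var)
print(sel.get_support())  # boolean mask of the features that are kept
```

The first column is 0 in five of six samples, so its variance of 5/36 (about 0.139) falls below 0.16 and it is the one removed.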