Implementation:Scikit learn Scikit learn MutualInfoClassif
| Knowledge Sources | |
|---|---|
| Domains | Feature Selection, Information Theory |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool, provided by scikit-learn, for estimating the mutual information between each feature and a target variable.
Description
The _mutual_info module provides the mutual_info_classif and mutual_info_regression functions, which estimate the mutual information between each feature and the target variable. Mutual information is a non-negative measure of the dependency between two random variables; it is zero if and only if the variables are independent, and larger values indicate stronger dependency. Unlike correlation-based methods, mutual information can capture any kind of statistical dependency, including non-linear relationships. The estimators are nonparametric: they rely on entropy estimation from k-nearest-neighbor distances, following Kraskov et al., with the discrete/continuous case following Ross (2014).
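To make the contrast with correlation concrete, here is a minimal sketch on synthetic data. The data-generating process and variable names are illustrative assumptions: the class label depends only on the magnitude of a feature, so the linear correlation should be near zero while the estimated mutual information should be clearly positive.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n_samples = 2000

# Assumed toy setup: the label depends on |x|, a symmetric (non-linear) function of x.
x = rng.uniform(-1.0, 1.0, size=n_samples)
y = (np.abs(x) > 0.5).astype(int)

# A second feature that is independent of the label, for comparison.
noise = rng.uniform(-1.0, 1.0, size=n_samples)
X = np.column_stack([x, noise])

# Pearson correlation only measures linear association.
pearson = np.corrcoef(x, y)[0, 1]

# Mutual information is estimated per column of X, in nats.
mi = mutual_info_classif(X, y, random_state=0)

print(f"Pearson correlation (x, y): {pearson:.3f}")
print(f"MI estimate for x:     {mi[0]:.3f}")
print(f"MI estimate for noise: {mi[1]:.3f}")
```

In a setup like this, a correlation-based filter would likely discard x, whereas an MI-based ranking should place it well above the noise column.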
Usage
Use mutual_info_classif for feature selection when you need to rank features by their statistical dependency with a discrete target variable (classification). Use mutual_info_regression for continuous targets. These functions are particularly useful as score functions for SelectKBest or SelectPercentile, especially when non-linear relationships between features and targets are expected.
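For the continuous-target case, the following sketch pairs mutual_info_regression with SelectPercentile. The synthetic data (a quadratic dependence on the first column plus small noise) is an assumption chosen purely for illustration.

```python
import numpy as np
from sklearn.feature_selection import SelectPercentile, mutual_info_regression

rng = np.random.default_rng(0)
n_samples = 1000

# Four continuous features; only column 0 influences the target (assumed toy setup).
X = rng.uniform(-1.0, 1.0, size=(n_samples, 4))
y = X[:, 0] ** 2 + 0.05 * rng.normal(size=n_samples)

# Keep the top 25% of features by estimated mutual information (1 of 4 here).
selector = SelectPercentile(mutual_info_regression, percentile=25)
X_top = selector.fit_transform(X, y)

print("MI scores:", np.round(selector.scores_, 3))
print("Kept columns:", selector.get_support(indices=True))
```

The selector stores the per-feature estimates in scores_, so the ranking can be inspected after fitting.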
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/feature_selection/_mutual_info.py
Signature
def mutual_info_classif(
    X,
    y,
    *,
    discrete_features="auto",
    n_neighbors=3,
    copy=True,
    random_state=None,
    n_jobs=None,
):

def mutual_info_regression(
    X,
    y,
    *,
    discrete_features="auto",
    n_neighbors=3,
    copy=True,
    random_state=None,
    n_jobs=None,
):
Import
from sklearn.feature_selection import mutual_info_classif
from sklearn.feature_selection import mutual_info_regression
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| X | array-like of shape (n_samples, n_features) | Yes | Feature matrix. |
| y | array-like of shape (n_samples,) | Yes | Target variable (discrete for classif, continuous for regression). |
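| discrete_features | 'auto', bool, or array-like | No | Whether features are discrete. A bool applies to all features; an array is either a boolean mask of shape (n_features,) or an array of indices of the discrete features. 'auto' treats features as continuous for dense X and as discrete for sparse X. Default is 'auto'. |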
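| n_neighbors | int | No | Number of neighbors used for MI estimation with continuous variables. Higher values reduce the variance of the estimate but may introduce bias. Default is 3. |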
| copy | bool | No | Whether to make a copy of the given data. Default is True. |
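| random_state | int, RandomState instance, or None | No | Seeds the small random noise added to continuous variables to remove repeated values. Pass an int for reproducible results. Default is None. |
| n_jobs | int | No | Number of jobs used to compute mutual information; parallelization is over the columns of X. None means 1 unless in a joblib.parallel_backend context; -1 uses all processors. Default is None. |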
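As a hedged illustration of the discrete_features parameter, the sketch below mixes one continuous and one discrete column and marks the discrete one explicitly, first by index and then by boolean mask. The synthetic columns and their names are assumptions made for the example.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n_samples = 1000
y = rng.integers(0, 2, size=n_samples)

# Continuous feature: the class label shifted by Gaussian noise (assumed toy setup).
cont = y + rng.normal(scale=0.5, size=n_samples)
# Discrete feature: a copy of the label that is flipped about 10% of the time.
disc = np.where(rng.random(n_samples) < 0.9, y, 1 - y)
X = np.column_stack([cont, disc])

# Mark column 1 as discrete by index ...
mi_by_index = mutual_info_classif(
    X, y, discrete_features=[1], random_state=0
)
# ... or, equivalently, with a boolean mask of shape (n_features,).
mi_by_mask = mutual_info_classif(
    X, y, discrete_features=np.array([False, True]), random_state=0
)

print("MI (index form):", np.round(mi_by_index, 3))
print("MI (mask form): ", np.round(mi_by_mask, 3))
```

With a dense X and the default 'auto', both columns would be treated as continuous, so flagging genuinely discrete columns like this is usually worth doing.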
Outputs
| Name | Type | Description |
|---|---|---|
| mi | ndarray of shape (n_features,) | Estimated mutual information between each feature and the target, in nats. Values are non-negative; a negative estimate is replaced by zero. |
Usage Examples
Basic Usage
from sklearn.feature_selection import mutual_info_classif
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=200, n_features=10, n_informative=3, random_state=42)
mi_scores = mutual_info_classif(X, y, random_state=42)
# Print the estimated MI score for each feature
for i, score in enumerate(mi_scores):
    print(f"Feature {i}: MI = {score:.4f}")
# Use with SelectKBest
from sklearn.feature_selection import SelectKBest
selector = SelectKBest(mutual_info_classif, k=3)
X_selected = selector.fit_transform(X, y)
print(f"Selected features shape: {X_selected.shape}")
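Reproducible Selection

When the bare function is passed to SelectKBest, it is called without a random_state, so the scores for continuous features can vary slightly between runs. One possible workaround, sketched here, is to bind the seed with functools.partial; the seed value and variable names are arbitrary choices for illustration.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, random_state=42
)

# Bind a fixed seed so every call to the score function uses the same noise.
score_func = partial(mutual_info_classif, random_state=0)

selector = SelectKBest(score_func, k=3)
X_selected = selector.fit_transform(X, y)
selected_idx = selector.get_support(indices=True)

# Fitting a second selector with the same score function repeats the same scores.
scores_again = SelectKBest(score_func, k=3).fit(X, y).scores_

print("Selected feature indices:", selected_idx)
print("Scores identical across fits:", bool((scores_again == selector.scores_).all()))
```

The same pattern can be used to pass other keyword arguments, such as n_neighbors or discrete_features, through a selector.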