Implementation:Scikit learn Scikit learn SparseFuncs
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Sparse Matrices |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete utility module, provided by scikit-learn, for efficient operations on sparse matrices.
Description
The sklearn.utils.sparsefuncs module provides utilities for working with SciPy sparse matrices in CSR and CSC format. It includes in-place scaling (inplace_csr_column_scale, inplace_csr_row_scale), mean and variance computation along an axis (mean_variance_axis, incr_mean_variance_axis), per-axis minimum and maximum (min_max_axis), sparse matrix multiplication with dense output (sparse_matmul_to_dense), and nonzero counting (count_nonzero). These operations work on the sparse representation directly, so the input matrix is never converted to a full dense array.
Usage
Use these functions when you need to perform feature-wise or sample-wise operations on sparse matrices without converting them to dense format, such as during feature scaling, normalization, or statistical computation.
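As an illustration of feature-wise scaling without densifying, the following minimal sketch rescales each column to unit variance. The zero-variance guard and the variable names are assumptions made for this example and are not part of the module.

```python
import numpy as np
from scipy import sparse
from sklearn.utils.sparsefuncs import inplace_csr_column_scale, mean_variance_axis

# Random CSR matrix used only as example data
X = sparse.random(200, 5, density=0.4, format="csr", random_state=0)

# Per-column variance, computed on the sparse representation
_, variances = mean_variance_axis(X, axis=0)

# Assumed guard for this sketch: leave constant columns unscaled
std = np.sqrt(variances)
std[std == 0.0] = 1.0

# Divide every column by its standard deviation, in place
inplace_csr_column_scale(X, 1.0 / std)

# Recompute the variances to check the result
_, scaled_variances = mean_variance_axis(X, axis=0)
```

Only the stored nonzero values are touched, so X stays sparse throughout. Centering is deliberately left out, since subtracting the column means would turn the implicit zeros into nonzero values.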
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/utils/sparsefuncs.py
Signature
def inplace_csr_column_scale(X, scale):
    ...
def inplace_csr_row_scale(X, scale):
    ...
def mean_variance_axis(X, axis, weights=None, return_sum_weights=False):
    ...
def incr_mean_variance_axis(X, *, axis, last_mean, last_var, last_n, weights=None):
    ...
def min_max_axis(X, axis, ignore_nan=False):
    ...
def count_nonzero(X, axis=None, sample_weight=None):
    ...
def sparse_matmul_to_dense(A, B):
    ...
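To show how the keyword-only arguments of incr_mean_variance_axis fit together, here is a hedged sketch that processes a matrix in two batches and compares the result with a single pass over the full matrix. The batch split and the variable names are illustrative assumptions.

```python
import numpy as np
from scipy import sparse
from sklearn.utils.sparsefuncs import incr_mean_variance_axis, mean_variance_axis

# Example data, split into two row batches (the split point is arbitrary)
X = sparse.random(100, 4, density=0.5, format="csr", random_state=0)
first, second = X[:60], X[60:]

# Statistics of the first batch
mean, var = mean_variance_axis(first, axis=0)
n_seen = np.full(first.shape[1], first.shape[0], dtype=np.float64)

# Fold the second batch into the running statistics
mean, var, n_seen = incr_mean_variance_axis(
    second, axis=0, last_mean=mean, last_var=var, last_n=n_seen
)

# Reference values from a single pass over all rows
full_mean, full_var = mean_variance_axis(X, axis=0)
```

The running mean, variance, and sample count returned by one call are passed back as last_mean, last_var, and last_n on the next call, which is how statistics can be accumulated over data that does not fit in memory at once.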
Import
from sklearn.utils.sparsefuncs import (
    inplace_csr_column_scale,
    mean_variance_axis,
    count_nonzero,
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
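| X | sparse matrix (CSR or CSC) | Yes | Sparse matrix to operate on; the inplace_csr_* functions expect CSR format |
| scale | ndarray | Yes (scaling functions only) | Scale factors for in-place scaling: one per column for inplace_csr_column_scale, one per row for inplace_csr_row_scale |
| axis | int | Yes (optional for count_nonzero) | Axis along which to compute: 0 yields one value per column, 1 yields one value per row |
| weights | ndarray | No | Weights for weighted mean/variance computations: one per row when axis=0, one per column when axis=1 |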
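The optional weights input can be sketched as follows. The example computes a weighted per-column mean and checks it against a dense NumPy computation; the dense conversion is used only for verification on this small matrix, and the weight values are arbitrary assumptions.

```python
import numpy as np
from scipy import sparse
from sklearn.utils.sparsefuncs import mean_variance_axis

# Small example matrix and one (arbitrary) weight per row
X = sparse.random(50, 3, density=0.5, format="csr", random_state=0)
rng = np.random.default_rng(0)
weights = rng.uniform(0.5, 2.0, size=X.shape[0])

# With return_sum_weights=True a third array is returned
means, variances, sum_weights = mean_variance_axis(
    X, axis=0, weights=weights, return_sum_weights=True
)

# Dense reference, for checking only
expected_means = np.average(X.toarray(), axis=0, weights=weights)
```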
Outputs
| Name | Type | Description |
|---|---|---|
| means | ndarray | Mean values along the specified axis |
| variances | ndarray | Variance values along the specified axis |
| mins | ndarray | Minimum values along the specified axis |
| maxs | ndarray | Maximum values along the specified axis |
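The outputs listed above are dense one-dimensional arrays with one entry per column or row. The short sketch below, on a hand-written 3 x 3 matrix chosen for this example, shows min_max_axis together with count_nonzero; note that implicit zeros take part in the minimum and maximum.

```python
import numpy as np
from scipy import sparse
from sklearn.utils.sparsefuncs import count_nonzero, min_max_axis

# Hand-written example matrix so the results are easy to verify
X = sparse.csr_matrix(np.array([[0.0, 2.0, -1.0],
                                [3.0, 0.0, 0.0],
                                [0.0, 5.0, 4.0]]))

# Per-column minimum and maximum; implicit zeros are included
mins, maxs = min_max_axis(X, axis=0)
print(mins)  # [ 0.  0. -1.]
print(maxs)  # [3. 5. 4.]

# Nonzero counts per column, per row, and in total
nnz_per_column = count_nonzero(X, axis=0)  # [1 2 2]
nnz_per_row = count_nonzero(X, axis=1)     # [2 1 2]
nnz_total = count_nonzero(X)               # 5
```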
Usage Examples
Basic Usage
import numpy as np
from scipy import sparse
from sklearn.utils.sparsefuncs import inplace_csr_column_scale, mean_variance_axis

# Create a random 100 x 10 sparse matrix in CSR format
X = sparse.random(100, 10, density=0.3, format="csr", random_state=0)

# Compute the per-column mean and variance (axis=0)
means, variances = mean_variance_axis(X, axis=0)
print(means.shape)  # (10,)

# Scale each column in place; X itself is modified and nothing is returned
scale = np.full(10, 2.0)
inplace_csr_column_scale(X, scale)
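As a complement to the column scaling above, the following sketch uses inplace_csr_row_scale for a sample-wise operation: dividing each row by its sum. Because sparse.random draws non-negative values by default, the row sum equals the L1 norm here. The guard for all-zero rows is an assumption added for this example.

```python
import numpy as np
from scipy import sparse
from sklearn.utils.sparsefuncs import inplace_csr_row_scale

# Example data with non-negative entries
X = sparse.random(20, 6, density=0.5, format="csr", random_state=0)

# Row sums as a flat array (X.sum returns a 2-D matrix)
row_sums = np.asarray(X.sum(axis=1)).ravel()

# Assumed guard for this sketch: rows that sum to zero keep a factor of 1
scale = np.ones_like(row_sums)
nonempty = row_sums > 0
scale[nonempty] = 1.0 / row_sums[nonempty]

# Multiply every row by its factor, in place
inplace_csr_row_scale(X, scale)

new_row_sums = np.asarray(X.sum(axis=1)).ravel()
```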