Implementation:Interpretml Interpret Clean Dimensions
| Field | Value |
|---|---|
| Sources | InterpretML |
| Domains | Data_Preprocessing, Validation |
| Last Updated | 2026-02-07 12:00 GMT |
Overview
clean_dimensions is a utility function in the InterpretML library that validates and normalizes the dimensions of raw input data.
Description
The clean_dimensions function accepts array-like input (lists, tuples, pandas DataFrames and Series, masked arrays, sparse matrices) and converts it to a validated numpy ndarray with the correct dimensionality. It handles edge cases such as single-element arrays, masked arrays, and pandas types. As part of the data preparation pipeline, it ensures that downstream operations receive data in a consistent, validated format.
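The kind of normalization described above can be illustrated with a small numpy-only sketch. This is not the InterpretML implementation: the function name sketch_clean_dimensions is hypothetical, and the specific rules it applies (rejecting masked entries, promoting scalars to one-element arrays, flattening 2D inputs that have a length-1 axis, rejecting more than two dimensions) are assumptions chosen for illustration only.

```python
import numpy as np


def sketch_clean_dimensions(data, param_name):
    """Hypothetical stand-in for illustration; NOT the InterpretML code."""
    # Assumption: masked entries are rejected rather than silently used.
    if isinstance(data, np.ma.MaskedArray):
        if np.ma.is_masked(data):
            raise ValueError(f"{param_name} contains masked values")
        data = data.data  # underlying ndarray without the mask

    arr = np.asarray(data)

    # Assumption: only 0-D, 1-D, and 2-D inputs are meaningful here.
    if arr.ndim > 2:
        raise ValueError(
            f"{param_name} must be 1D or 2D, got {arr.ndim} dimensions"
        )

    if arr.ndim == 0:
        # A bare scalar becomes a one-element 1-D array.
        arr = arr.reshape(1)
    elif arr.ndim == 2 and 1 in arr.shape:
        # (n, 1) and (1, n) inputs are flattened to 1-D.
        arr = arr.reshape(-1)

    return arr


# A column-shaped target is flattened to one dimension.
y = sketch_clean_dimensions([[1], [0], [1]], "y")
print(y.shape)
```

Passing param_name into every error message is what makes the failure traceable to the argument the caller supplied, which is the role the real function's second parameter plays according to the I/O contract.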
Usage
Import this function when you need to validate raw user-provided y, sample_weight, or other 1D/2D data before feeding it into EBM training.
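When y and sample_weight are validated together, a caller may also want to confirm that the two arrays line up before training. The sketch below shows one way to do that with plain numpy. The helper check_y_and_weights is hypothetical and not part of InterpretML, and the requirement that y be 1D is an assumption made for this example.

```python
import numpy as np


def check_y_and_weights(y, sample_weight=None):
    """Hypothetical pre-flight check; not part of InterpretML."""
    y_arr = np.asarray(y)
    # Assumption for this sketch: the target is one-dimensional.
    if y_arr.ndim != 1:
        raise ValueError(f"y must be 1D, got shape {y_arr.shape}")

    if sample_weight is None:
        return y_arr, None

    w_arr = np.asarray(sample_weight, dtype=np.float64)
    # Each sample needs exactly one weight.
    if w_arr.shape != y_arr.shape:
        raise ValueError(
            f"sample_weight has shape {w_arr.shape}, expected {y_arr.shape}"
        )
    return y_arr, w_arr


y_arr, w_arr = check_y_and_weights([1, 0, 1], [1.0, 2.0, 1.5])
print(y_arr.shape, w_arr.shape)
```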
Code Reference
Source Location
- Repository: interpretml/interpret
- File: python/interpret-core/interpret/utils/_clean_simple.py
- Lines: 49-230
Signature
def clean_dimensions(data, param_name):
Import
from interpret.utils._clean_simple import clean_dimensions
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| data | Any (array-like) | Yes | Raw input data to validate |
| param_name | str | Yes | Name used in error messages for debugging |
Outputs
| Name | Type | Description |
|---|---|---|
| return | np.ndarray | Cleaned numpy array with validated dimensions |
Usage Examples
Basic Usage
import numpy as np
from interpret.utils._clean_simple import clean_dimensions
# Clean a list into a numpy array
y_raw = [1, 0, 1, 1, 0]
y_clean = clean_dimensions(y_raw, "y")
# y_clean is now a 1D numpy array holding the values 1, 0, 1, 1, 0
# Clean a pandas Series
import pandas as pd
weights = pd.Series([1.0, 2.0, 1.5])
weights_clean = clean_dimensions(weights, "sample_weight")