Overview
ClassHistogram and Marginal are data exploration explainers that produce interactive visualizations. ClassHistogram shows how feature distributions differ across classes in classification datasets, and Marginal shows feature-response relationships, typically in regression datasets.
Description
This module provides two data-level explainer classes that conform to the InterpretML explainer API:
- Marginal: Generates marginal plots for each feature, showing scatter plots of feature values versus response values along with density histograms for both axes. For continuous features, it also computes the Pearson correlation coefficient. For categorical (nominal/ordinal) features, it displays box plots instead of scatter plots.
- ClassHistogram: Generates stacked histogram visualizations for classification problems, showing how each feature's distribution differs across target classes. The overall view shows the class distribution as a bar chart, while per-feature views show class-conditional histograms stacked on top of each other.
Both classes produce Plotly-based interactive figures through their corresponding explanation objects (MarginalExplanation and ClassHistogramExplanation).
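To make the correlation statistic concrete, here is a minimal, dependency-free sketch of the Pearson correlation coefficient for one continuous feature and a response. It illustrates the formula only and is not the library's implementation; the function name pearson_r and the sample data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Illustrative helper only (hypothetical name), not part of interpret.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations from the mean for each sequence
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Numerator: sum of products of deviations
    cov = sum(a * b for a, b in zip(dx, dy))
    # Denominator: square root of the product of summed squared deviations
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0.0:
        return float("nan")  # a constant column has no defined correlation
    return cov / norm


# Made-up data: a feature with a nearly linear relationship to the response
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson_r(feature, response)
print(f"Pearson r = {r:.4f}")
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear association; the scatter plot remains useful for spotting non-linear patterns that the coefficient misses.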
Usage
Use Marginal to explore the relationship between each feature and the response, whether the response is continuous (regression) or a set of class labels. Use ClassHistogram when you specifically want to see how feature distributions differ across classes in a classification problem. Both are intended for exploratory data analysis before model training.
Code Reference
Source Location
Signature
class Marginal(ExplainerMixin):
    available_explanations = ["data"]
    explainer_type = "data"

    def __init__(
        self,
        feature_names=None,
        feature_types=None,
        max_scatter_samples=400,
    ): ...

    def explain_data(self, X, y, name=None): ...

class ClassHistogram(ExplainerMixin):
    available_explanations = ["data"]
    explainer_type = "data"

    def __init__(self, feature_names=None, feature_types=None): ...

    def explain_data(self, X, y, name=None): ...
Import
from interpret.data import Marginal, ClassHistogram
I/O Contract
Marginal
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| feature_names | list of str | No | List of feature names for columns of X |
| feature_types | list of str | No | List of feature types (e.g. "continuous", "nominal", "ordinal") |
| max_scatter_samples | int | No | Maximum number of sample points to display in scatter plots (default 400) |
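The effect of max_scatter_samples can be pictured as capping the number of (feature, response) pairs passed to the scatter plot. The sketch below is an assumption about the general idea, not the library's code: the helper name subsample_pairs, the use of uniform random sampling without replacement, and the seed parameter are all hypothetical.

```python
import random


def subsample_pairs(xs, ys, max_samples=400, seed=0):
    """Return at most max_samples (x, y) pairs, keeping x and y aligned.

    Hypothetical helper: uniform sampling without replacement is an
    assumption, not a statement about how interpret selects points.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n <= max_samples:
        # Small datasets are plotted in full
        return list(xs), list(ys)
    rng = random.Random(seed)
    # Draw distinct row indices, then sort them to preserve the original order
    idx = sorted(rng.sample(range(n), max_samples))
    return [xs[i] for i in idx], [ys[i] for i in idx]


# Made-up data: 1000 rows reduced to 400 plotted points
xs = list(range(1000))
ys = [2 * x for x in xs]
sx, sy = subsample_pairs(xs, ys, max_samples=400)
print(len(sx), len(sy))
```

Capping the point count keeps the interactive figure responsive on large datasets; lowering the value trades visual detail for rendering speed.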
explain_data Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| X | numpy array or compatible | Yes | Feature matrix to explore |
| y | numpy array | Yes | Response vector (1-dimensional) |
| name | str | No | User-defined explanation name |
Outputs
| Name | Type | Description |
|---|---|---|
| explanation | MarginalExplanation | Explanation object with per-feature marginal plots and overall response density |
ClassHistogram
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| feature_names | list of str | No | List of feature names for columns of X |
| feature_types | list of str | No | List of feature types (e.g. "continuous", "nominal", "ordinal") |
explain_data Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| X | numpy array or compatible | Yes | Feature matrix to explore |
| y | numpy array | Yes | Classification labels (1-dimensional) |
| name | str | No | User-defined explanation name |
Outputs
| Name | Type | Description |
|---|---|---|
| explanation | ClassHistogramExplanation | Explanation object with per-feature class-conditional histograms |
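A class-conditional histogram counts, for each class separately, how many feature values fall into each bin; stacking those counts per bin gives the per-feature view, and counting labels alone gives the overall class distribution. The following dependency-free sketch shows that computation under simple assumptions: the helper name class_histograms, the equal-width binning, and the sample data are hypothetical and do not describe the library's internal binning.

```python
from collections import Counter


def class_histograms(values, labels, n_bins=4):
    """Count values per equal-width bin, separately for each class label.

    Hypothetical helper for illustration; interpret's binning may differ.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant features
    counts = {label: [0] * n_bins for label in sorted(set(labels))}
    for v, label in zip(values, labels):
        # Clamp so the maximum value lands in the last bin
        b = min(int((v - lo) / width), n_bins - 1)
        counts[label][b] += 1
    return counts


# Made-up data: one feature, two classes
values = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0]
labels = [0, 0, 0, 1, 1, 1]

per_class = class_histograms(values, labels, n_bins=4)  # per-feature view
overall = Counter(labels)                               # overall class counts

# Stacked bar heights: sum the class counts within each bin
stacked = [sum(col) for col in zip(*per_class.values())]
print(per_class, dict(overall), stacked)
```

When the per-class counts occupy different bins, as in this toy data, the feature separates the classes well; heavily overlapping counts suggest the feature carries little class information on its own.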
Usage Examples
Basic Marginal Example
from interpret.data import Marginal
from sklearn.datasets import load_diabetes
# Bundled regression dataset with a continuous response
X, y = load_diabetes(return_X_y=True)
marginal = Marginal(max_scatter_samples=200)
explanation = marginal.explain_data(X, y, name="Diabetes Data Exploration")
# Visualize overall response distribution
explanation.visualize(key=None)
# Visualize a specific feature (feature index 0)
explanation.visualize(key=0)
Basic ClassHistogram Example
from interpret.data import ClassHistogram
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
class_hist = ClassHistogram()
explanation = class_hist.explain_data(X, y, name="Iris Data Exploration")
# Visualize overall class distribution
explanation.visualize(key=None)
# Visualize class-conditional histogram for a specific feature
explanation.visualize(key=0)