Workflow:Interpretml Interpret EBM Training And Prediction

From Leeroopedia


Knowledge Sources
Domains Machine_Learning, Interpretability, Generalized_Additive_Models
Last Updated 2026-02-07 12:00 GMT

Overview

End-to-end process for training an Explainable Boosting Machine (EBM) on tabular data and generating predictions using the InterpretML library.

Description

This workflow covers the complete pipeline for building an interpretable machine learning model with the Explainable Boosting Machine (EBM), the flagship model of InterpretML. EBMs are a modern implementation of Generalized Additive Models with pairwise interactions (GA2Ms). They combine gradient boosting, bagging, and automatic interaction detection to reach accuracy competitive with black-box models while remaining fully interpretable. The process begins with raw tabular data and produces a fitted classification or regression model that yields exact per-feature contribution scores for every prediction.

Key outputs:

  • A fitted EBM model with learned feature shape functions (term scores)
  • Per-feature contribution scores for each prediction
  • Standard deviations across bags for confidence estimation

Scope:

  • Covers data cleaning, feature binning, boosting, and prediction
  • Supports classification (binary and multiclass) and regression
  • Includes differential privacy variants (DP-EBM)

Strategy:

  • Uses quantile-based binning to discretize continuous features
  • Applies cyclic gradient boosting, updating one feature (term) at a time
  • Employs bagging (outer bags) for variance reduction and confidence intervals
  • Automatically detects and includes pairwise feature interactions

Usage

Execute this workflow when you have a tabular dataset (structured data with rows and columns) and need to train a machine learning model that provides exact, auditable explanations for every prediction. This is particularly suited for regulated industries (healthcare, finance, insurance) where model transparency is required, or when domain experts need to review and potentially edit model behavior. The EBM handles mixed feature types (continuous, categorical, and string data) natively without requiring manual preprocessing.

Execution Steps

Step 1: Data Preparation and Validation

Clean and validate the input training data. The framework accepts pandas DataFrames, numpy arrays, and handles string/categorical data natively. Input features (X) and target variable (y) are validated for consistency. Optional sample weights and initialization scores can be provided for custom weighting or transfer learning scenarios.

Key considerations:

  • The framework auto-detects feature types (continuous vs categorical) from the data
  • Missing values are handled natively through a dedicated missing bin
  • For classification, target labels are automatically mapped to class indices
  • Sample weights allow emphasizing certain observations during training
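
As a rough illustration of the label mapping mentioned above, the sketch below assigns class indices by sorted label order. The helper name `encode_labels` is hypothetical and not part of InterpretML, and the library's own ordering rules may differ.

```python
def encode_labels(labels):
    """Map arbitrary class labels to integer class indices.

    Assumption for this sketch: classes are ordered by sorting the
    unique labels. This is an illustration, not the library's code.
    """
    classes = sorted(set(labels))
    index_of = {label: i for i, label in enumerate(classes)}
    return classes, [index_of[label] for label in labels]


classes, y_idx = encode_labels(["no", "yes", "no", "yes", "yes"])
print(classes, y_idx)
```

Keeping the `classes` list around lets predicted indices be translated back to the original labels.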

Step 2: Feature Binning and Discretization

Transform raw feature values into discrete bins suitable for the boosting algorithm. Continuous features are binned using quantile-based cuts (by default) to create roughly equal-frequency bins. Categorical features are mapped to integer bin indices. This step produces a hierarchical bin structure: finer bins for main effects and coarser bins for interaction terms to manage computational complexity.

What happens:

  • Quantile cuts are computed for continuous features respecting minimum samples per bin
  • Categorical values are enumerated and mapped to bin indices
  • Bin weights (sample counts per bin) are recorded for importance calculations
  • Feature bounds (min/max observed values) are stored for later visualization
  • Histogram data is computed for density plots in explanations
  • For DP-EBM: noise is added to bin boundaries using the privacy budget
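
The quantile binning described above can be sketched in plain Python as follows. This is a simplified illustration, not InterpretML's binning code: the function names are hypothetical, minimum-samples-per-bin handling is omitted, and reserving bin 0 for missing values is an assumption of the sketch.

```python
import bisect
import math


def quantile_cuts(values, max_bins):
    """Return sorted cut points that give roughly equal-frequency bins."""
    clean = sorted(v for v in values if v is not None and not math.isnan(v))
    cuts = []
    for k in range(1, max_bins):
        idx = k * len(clean) // max_bins
        if idx == 0:
            continue  # too few samples to place this cut
        lo, hi = clean[idx - 1], clean[idx]
        if lo != hi:  # only cut between two distinct values
            cut = (lo + hi) / 2
            if not cuts or cut > cuts[-1]:
                cuts.append(cut)
    return cuts


def to_bin(value, cuts):
    """Bin 0 holds missing values (sketch assumption); others start at 1."""
    if value is None or math.isnan(value):
        return 0
    return 1 + bisect.bisect_right(cuts, value)


cuts = quantile_cuts([1, 2, 3, 4, 5, 6, 7, 8], max_bins=4)
print(cuts, to_bin(3.0, cuts), to_bin(float("nan"), cuts))
```

The same `cuts` must be reused at prediction time so that new values land in the bins the model was trained on.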

Step 3: Interaction Detection

Automatically identify pairs of features that have significant interaction effects. The framework uses the FAST algorithm to rank candidate feature pairs by their interaction strength, then selects the top-ranked pairs to include as additional terms in the model.

Key considerations:

  • Interaction detection runs on the native C++ engine for performance
  • The number of interactions is controlled by the interactions parameter; max_interaction_bins sets the bin resolution used for interaction terms
  • Each selected interaction creates a two-dimensional lookup table (tensor) of scores
  • Users can also manually specify interactions to include or exclude
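
To make the idea of ranking pairs by interaction strength concrete, here is a deliberately simplified stand-in. It is not the FAST algorithm: it scores each pair by how much residual variation a two-dimensional table of per-cell means would explain, assuming the residuals already have main effects removed. The names `pair_strength` and `rank_pairs` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pair_strength(bins_i, bins_j, residuals):
    """Sum over cells of (sum of residuals)^2 / count for one feature pair."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for bi, bj, r in zip(bins_i, bins_j, residuals):
        sums[(bi, bj)] += r
        counts[(bi, bj)] += 1
    return sum(sums[cell] ** 2 / counts[cell] for cell in sums)


def rank_pairs(binned_columns, residuals, top_k):
    """Score every feature pair and keep the top_k strongest."""
    scored = [
        ((i, j), pair_strength(binned_columns[i], binned_columns[j], residuals))
        for i, j in combinations(range(len(binned_columns)), 2)
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy data: the residual depends only on whether features 0 and 1 agree.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
columns = [[row[k] for row in rows] for k in range(3)]
residuals = [1.0 if a == b else -1.0 for a, b, _ in rows]
print(rank_pairs(columns, residuals, top_k=1))
```

On this toy data the pair (0, 1) scores highest, while pairs involving feature 2 score zero because their cell sums cancel.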

Step 4: Bagged Gradient Boosting

Train the model using an ensemble of boosting iterations across multiple bags (bootstrap samples). Each outer bag creates an independent training/validation split. Within each bag, the boosting algorithm cycles through individual features, fitting one-dimensional trees to the residuals for each feature in turn. This round-robin approach ensures the model learns additive contributions that can be attributed to individual features.

What happens:

  • Multiple outer bags are created (set by the outer_bags parameter; default 8 for both classification and regression, though the default may vary by library version)
  • Each bag gets a random train/validation split
  • The boosting loop cycles through all terms (features and interactions)
  • For each term, a shallow tree restricted to that term's feature(s) is fit to the current residuals (one-dimensional for main effects)
  • Term updates are applied with a learning rate to prevent overfitting
  • A greedy phase selects high-gain terms for additional boosting rounds
  • Early stopping monitors validation loss to prevent overfitting
  • Smoothing rounds add random splits to improve generalization
  • For DP-EBM: Gaussian noise is added to each term update based on the privacy budget
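
The round-robin update at the core of this step can be sketched as below. This is a minimal illustration under strong simplifications, not the library's implementation: it handles squared-error regression on already binned features, replaces the per-term tree with a per-bin mean of residuals, and omits bagging, the greedy phase, early stopping, and smoothing. The function name `fit_cyclic` is hypothetical.

```python
def fit_cyclic(binned_columns, y, n_bins, rounds=200, learning_rate=0.1):
    """Cycle through features, nudging each feature's bin scores
    toward the mean residual observed in that bin."""
    n = len(y)
    intercept = sum(y) / n
    scores = [[0.0] * nb for nb in n_bins]
    preds = [intercept] * n
    for _ in range(rounds):
        for f, col in enumerate(binned_columns):
            sums = [0.0] * n_bins[f]
            counts = [0] * n_bins[f]
            for b, target, p in zip(col, y, preds):
                sums[b] += target - p  # residual under squared error
                counts[b] += 1
            # A small, learning-rate-scaled step for this one feature only.
            update = [
                learning_rate * s / c if c else 0.0
                for s, c in zip(sums, counts)
            ]
            for b in range(n_bins[f]):
                scores[f][b] += update[b]
            preds = [p + update[b] for p, b in zip(preds, col)]
    return intercept, scores


# Toy additive target: y = g0[bin of feature 0] + g1[bin of feature 1].
columns = [[0, 0, 1, 1], [0, 1, 0, 1]]
y = [0.0, 1.0, 2.0, 3.0]
intercept, scores = fit_cyclic(columns, y, n_bins=[2, 2])
print(intercept, scores)
```

Because each update touches a single feature, the learned `scores[f]` can be read directly as that feature's shape function.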

Step 5: Model Aggregation and Postprocessing

Aggregate results from all outer bags to produce the final model. Term scores (shape functions) from each bag are averaged to produce stable estimates. Standard deviations across bags provide confidence intervals. The intercept (base prediction) is computed as the mean prediction.

Key considerations:

  • Term scores from all bags are averaged for the final model
  • Standard deviations across bags quantify model uncertainty
  • Score tensors are purified using functional ANOVA decomposition to ensure identifiability
  • For multiclass: separate score tensors are maintained for each class
  • The model stores all bag-level results for later merge operations
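
The averaging across bags can be illustrated with the standard library. In this sketch each bag contributes one list of bin scores for a single term; the helper name `aggregate_bags` is hypothetical, and using the population standard deviation is an assumption of the sketch rather than a statement about the library.

```python
from statistics import fmean, pstdev


def aggregate_bags(per_bag_scores):
    """Combine per-bag score lists for one term into a mean and a
    standard deviation per bin."""
    by_bin = list(zip(*per_bag_scores))  # regroup scores bin by bin
    means = [fmean(values) for values in by_bin]
    stds = [pstdev(values) for values in by_bin]
    return means, stds


bag_scores = [
    [1.0, 2.0],  # term scores learned in bag 0
    [3.0, 2.0],  # term scores learned in bag 1
]
means, stds = aggregate_bags(bag_scores)
print(means, stds)
```

A bin whose scores agree across bags gets a standard deviation near zero, while disagreement between bags shows up as a wider error bar.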

Step 6: Prediction

Generate predictions for new data by evaluating the learned shape functions. Each feature value is looked up in its corresponding term score table, and the contributions from all terms are summed together with the intercept. The resulting raw score is passed through the inverse of the link function (for example, the logistic sigmoid for binary classification) to produce the final prediction.

What happens:

  • New data is binned using the same bin cuts learned during training
  • Each feature value maps to a bin, which maps to a score contribution
  • All term contributions are summed: prediction = intercept + sum(term_scores)
  • For classification: the inverse link converts scores to probabilities (logistic sigmoid for binary, softmax for multiclass)
  • For regression: identity link returns scores directly
  • The predict_proba method returns class probabilities for classifiers
  • The predict method returns the most likely class or regression value
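
The lookup-and-sum rule above can be written out as a short sketch for the binary classification case with main-effect terms only. The function names and the toy score tables are hypothetical and serve only to show the arithmetic.

```python
import math


def predict_score(bin_indices, term_scores, intercept):
    """Look up each term's contribution and add the intercept."""
    contributions = [
        scores[b] for scores, b in zip(term_scores, bin_indices)
    ]
    return intercept + sum(contributions), contributions


def sigmoid(score):
    """Inverse of the logit link: raw score to probability."""
    return 1.0 / (1.0 + math.exp(-score))


term_scores = [
    [0.0, -1.0, 1.0],  # hypothetical score table for feature 0
    [0.0, 0.5, -0.5],  # hypothetical score table for feature 1
]
score, contributions = predict_score([2, 2], term_scores, intercept=-0.5)
print(contributions, score, sigmoid(score))
```

The `contributions` list is the per-feature explanation for that single prediction: its entries plus the intercept add up exactly to the raw score.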
