Workflow:Interpretml Interpret EBM Model Merging

Knowledge Sources	InterpretML, InterpretML Docs, Federated DP-EBM Paper
Domains	Machine_Learning, Interpretability, Federated_Learning
Last Updated	2026-02-07 12:00 GMT

Overview

End-to-end process for merging multiple independently trained Explainable Boosting Machines into a single unified model, enabling federated learning and distributed training scenarios.

Description

This workflow combines two or more EBM models trained on separate data partitions into one merged model. This is essential for federated learning, where data cannot be centralized because of privacy or regulatory constraints. The merge operation harmonizes differing bin structures, reconciles feature definitions, and averages term score tensors to produce a single model that approximates what would have been learned from the combined dataset. The merged model is a fully functional EBM that generates predictions and explanations just like a model trained on centralized data.

Key outputs:

A single merged EBM model combining knowledge from all input models
Harmonized bin structures that span all observed value ranges
Averaged term scores with proper weighting across models

Scope:

Covers validation, bin harmonization, tensor mapping, and aggregation
Supports merging classifiers, regressors, or compatible combinations
Handles mismatched bin boundaries and interaction terms; the number of features must be identical across models

Strategy:

Unify bin boundaries across all models using proportional remapping
Map score tensors from each model's bin structure to the unified structure
Average mapped tensors weighted by the number of samples from each model
Estimate missing terms from one model using weight distributions from other models

Usage

Execute this workflow when you have multiple EBM models trained on separate data partitions and need to combine them into a single model. Common scenarios include: federated learning across institutions that cannot share patient/customer data, distributed training across geographic regions for data residency compliance, ensemble building from models trained on different time periods, or combining models from differentially private training runs. All input models must share the same feature set and link function.
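
A minimal sketch of the workflow's entry point, assuming the `interpret` package is installed and that every partition has the same columns and the same target classes. The helper name `merge_partition_models` and its arguments are hypothetical. In a real federated setup each site would fit its own model and share only the fitted object; fitting them in one loop here is purely for illustration.

```python
def merge_partition_models(partitions, seed=0):
    """Fit one EBM per (X, y) partition and merge them into a single model."""
    # Imported lazily so this sketch can be loaded without the package installed.
    from interpret.glassbox import ExplainableBoostingClassifier, merge_ebms

    models = []
    for X_part, y_part in partitions:
        ebm = ExplainableBoostingClassifier(random_state=seed)
        ebm.fit(X_part, y_part)
        models.append(ebm)

    # merge_ebms takes a list of fitted EBMs and returns one merged EBM.
    return merge_ebms(models)
```

The returned object can then be used like any other fitted EBM, for example `merged.predict(X_new)` or `merged.explain_global()`.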

Execution Steps

Step 1: Validate Model Compatibility

Verify that all input models can be meaningfully merged. Check that models share compatible configurations including the same link function, link parameters, number of features, and compatible task types (all classifiers or all regressors).

Key considerations:

All models must be fitted (check_is_fitted)
Link functions must match across all models
Number of features must be identical
Task types must be compatible (classifier+classifier or regressor+regressor)
For classification: class sets should be compatible
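
The checks above can be sketched as a small pre-merge guard. `ModelSummary` is a hypothetical stand-in for whatever attributes the fitted models expose; its field names are assumptions made for this example, not the library's attribute names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelSummary:
    """Hypothetical summary of the attributes compared before a merge."""
    is_fitted: bool
    task: str                    # "classification" or "regression"
    link: str                    # e.g. "logit" or "identity"
    link_param: Optional[float]  # None when the link takes no parameter
    n_features: int


def check_mergeable(summaries):
    """Raise ValueError if the models cannot be merged; return None otherwise."""
    if not summaries:
        raise ValueError("no models to merge")
    if not all(s.is_fitted for s in summaries):
        raise ValueError("all models must be fitted")
    # Every remaining field must take exactly one value across all models.
    for field in ("task", "link", "link_param", "n_features"):
        values = {getattr(s, field) for s in summaries}
        if len(values) > 1:
            raise ValueError(
                f"models disagree on {field}: {sorted(values, key=str)}"
            )
```

Failing fast here avoids spending time on bin harmonization for models that could never be combined.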

Step 2: Harmonize Feature Bins

Create a unified bin structure that spans the value ranges observed across all input models. For continuous features, merge the cut points from all models into a single sorted set of bin boundaries. For categorical features, create a union of all observed categories. This unified structure serves as the common framework for remapping score tensors.

What happens:

For each feature, collect bin definitions from all models
Continuous features: merge and sort all cut points, removing duplicates
Categorical features: take the union of all category sets
Feature metadata (names, types, bounds) are reconciled
The result is a single bin structure that can represent values from any input model
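
As an illustration of the two union rules, the following sketch harmonizes one feature at a time. The function names are hypothetical, and the sketch ignores any special bins (such as ones reserved for missing or unseen values) that a real bin structure may carry.

```python
def merge_cut_points(cuts_per_model):
    """Union one continuous feature's cut points across models, sorted, no duplicates."""
    return sorted(set().union(*cuts_per_model))


def merge_categories(categories_per_model):
    """Union one categorical feature's categories, keeping first-seen order."""
    merged = {}
    for categories in categories_per_model:
        for category in categories:
            # Assign the next free bin index the first time a category appears.
            merged.setdefault(category, len(merged))
    return merged  # maps category -> bin index in the unified structure
```

For example, cut points `[1.0, 5.0]` and `[2.5, 5.0, 8.0]` merge to `[1.0, 2.5, 5.0, 8.0]`, so every original bin boundary is still a boundary in the unified structure.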

Step 3: Remap Score Tensors

Map each model's term score tensors from their original bin structure to the unified bin structure. When bins from different models do not align exactly, scores are distributed proportionally based on the fraction of each old bin that falls within each new bin.

What happens:

For each term in each model, the original score tensor is remapped
Proportional mapping: if an old bin spans multiple new bins, its score is split proportionally
Bin weights are similarly remapped to track sample counts in the new structure
For interaction terms: the remapping applies independently along each dimension
Missing terms (present in one model but not another) are estimated using weight-based interpolation from models that do contain the term
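
A simplified sketch of proportional remapping for a single continuous main-effect term. It assumes finite outer bounds and that samples are spread uniformly inside each old bin, so an old bin's weight is divided by overlapping width; each new bin's score is the weight-averaged score of the old bins it overlaps. This is an illustration of the idea under those assumptions, not the library's internal routine, and `remap_bins` is a hypothetical name.

```python
def remap_bins(old_edges, old_scores, old_weights, new_edges):
    """Remap per-bin scores and weights from old_edges onto new_edges.

    Edges are sorted and include the outer bounds, so n edges describe n - 1 bins.
    """
    new_scores, new_weights = [], []
    for lo, hi in zip(new_edges[:-1], new_edges[1:]):
        total_w = 0.0   # weight apportioned to this new bin
        total_ws = 0.0  # weight * score accumulated for the average
        for i, (olo, ohi) in enumerate(zip(old_edges[:-1], old_edges[1:])):
            overlap = min(hi, ohi) - max(lo, olo)
            if overlap <= 0:
                continue
            # Share of the old bin's weight that lands in the new bin.
            share = old_weights[i] * overlap / (ohi - olo)
            total_w += share
            total_ws += share * old_scores[i]
        new_weights.append(total_w)
        new_scores.append(total_ws / total_w if total_w > 0 else 0.0)
    return new_scores, new_weights
```

Because the unified cut points include every original cut point, each new bin usually sits inside exactly one old bin; in that case the score carries over unchanged and only the weight is divided. The total weight is preserved by the remapping.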

Step 4: Average Across Models

Combine the remapped score tensors from all models into a single set of averaged term scores. Each model's contribution is weighted by the number of samples it was trained on, ensuring that models trained on larger datasets have proportionally more influence on the final result.

Key considerations:

Intercepts are averaged across all models
Term scores are weighted averages using bin weights as proxies for sample counts
Standard deviations are computed from the variance across model contributions
Bagged scores from individual models are preserved for downstream analysis
The averaging produces a model that approximates centralized training
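
A minimal sketch of the per-bin aggregation for one term, assuming every model's scores and bin weights have already been remapped onto the same unified bins. The weighted variance used for the standard deviation is one plausible reading of "variance across model contributions", not a confirmed formula, and `average_term` is a hypothetical name.

```python
import math


def average_term(scores_per_model, weights_per_model):
    """Weighted mean and standard deviation per bin across remapped models."""
    means, stds, totals = [], [], []
    # zip(*...) regroups the data so each iteration sees one bin across all models.
    for scores, weights in zip(zip(*scores_per_model), zip(*weights_per_model)):
        total = sum(weights)
        if total == 0:
            # No model saw data in this bin, so there is nothing to average.
            mean, var = 0.0, 0.0
        else:
            mean = sum(w * s for w, s in zip(weights, scores)) / total
            var = sum(w * (s - mean) ** 2 for w, s in zip(weights, scores)) / total
        means.append(mean)
        stds.append(math.sqrt(var))
        totals.append(total)
    return means, stds, totals
```

With scores `[1.0, 0.0]` and `[3.0, 2.0]` and bin weights `[10, 30]` and `[30, 10]`, the first bin averages to 2.5 because the second model contributed three times as many samples there.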

Step 5: Finalize Merged Model

Construct the final merged EBM model object with all computed attributes. Apply postprocessing steps including score purification, term name generation, and metadata assembly. The resulting model is a fully functional EBM that can make predictions and generate explanations.

What happens:

A new EBM model object is created (classifier or regressor as appropriate)
Merged bins, term scores, intercept, and standard deviations are assigned
Extra bins are cleaned up (removing empty edge bins)
Term names are generated from the merged feature names
The model is marked as fitted and ready for prediction/explanation
All standard EBM methods (predict, explain_global, explain_local) work on the merged model
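
To show why the merged bins, term scores, and intercept are enough to make predictions, here is a sketch of an additive lookup for main-effect terms only, assuming an identity link (regression) and continuous features. `predict_additive` is a hypothetical helper; it ignores interaction terms and any special bins for missing values, and a value equal to a cut point is placed in the upper bin.

```python
from bisect import bisect_right


def predict_additive(intercept, cuts_per_feature, scores_per_feature, row):
    """Sum the intercept and the score of the bin each feature value falls into."""
    total = intercept
    for cuts, scores, value in zip(cuts_per_feature, scores_per_feature, row):
        # n cut points define n + 1 bins; bisect_right returns the bin index.
        total += scores[bisect_right(cuts, value)]
    return total
```

Each term's contribution is a single table lookup, which is also what makes the merged model's explanations directly readable per feature.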
