Implementation:Interpretml Interpret PartitionRandomBoosting
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, EBM_Core |
| Last Updated | 2026-02-07 12:00 GMT |
Overview
PartitionRandomBoosting is a C++ module that performs random split selection for Explainable Boosting Machine (EBM) boosting, primarily used for training differentially private EBMs.
Description
This module implements a random partitioning strategy for boosting interaction terms, where split points are chosen randomly rather than by searching for the optimal cut. It is the primary splitting algorithm when differential privacy is enabled, since optimal split selection depends on the training data and would leak information about it.
The algorithm works as follows:
- Collapse degenerate dimensions: Dimensions with only 1 bin are collapsed, and the remaining real dimensions are processed.
- Determine splits per dimension: For each dimension, computes the number of random splits based on aLeavesMax, capping at the number of available bins.
- Allocate and generate random splits: Uses the RandomDeterministic RNG to generate random split positions for each dimension, then sorts them to ensure ascending order.
- Construct the collapsed tensor: Iterates over all cells in the collapsed tensor, mapping each cell to its corresponding original bin and accumulating gradient/hessian statistics.
- Compute update scores: For each cell in the collapsed tensor, computes the boosting update using regularization (alpha, lambda), Newton or gradient-only updates, and step size constraints. Handles monotone constraints for single-dimension terms.
- Build the output tensor: Maps the collapsed tensor splits back to the original tensor dimensions and sets the scores in the output Tensor.
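The split-selection and collapsing steps above can be sketched in a few lines of Python. This is an illustration only, not the libebm code: the function names `choose_random_cuts` and `collapse` are hypothetical, the rule "number of cuts = leaves_max - 1, capped at n_bins - 1" is an assumption about how aLeavesMax is interpreted, and Python's `random.Random` stands in for the RandomDeterministic generator. The sketch is limited to two dimensions for readability.

```python
import random
from bisect import bisect_right


def choose_random_cuts(rng, n_bins, leaves_max):
    """Pick sorted random cut positions for one dimension.

    A cut at position p separates bin p-1 from bin p, so the valid
    positions are 1..n_bins-1. Assumption: the number of cuts is
    leaves_max - 1, capped at the number of valid positions.
    A dimension with a single bin therefore gets no cuts.
    """
    n_cuts = max(0, min(leaves_max - 1, n_bins - 1))
    # Sampling without replacement, then sorting into ascending order.
    return sorted(rng.sample(range(1, n_bins), n_cuts))


def collapse(grad, hess, cuts_per_dim):
    """Sum per-bin gradient/hessian values into the cells defined by the cuts."""
    row_cuts, col_cuts = cuts_per_dim
    n_rows, n_cols = len(row_cuts) + 1, len(col_cuts) + 1
    g = [[0.0] * n_cols for _ in range(n_rows)]
    h = [[0.0] * n_cols for _ in range(n_rows)]
    for i, (g_row, h_row) in enumerate(zip(grad, hess)):
        r = bisect_right(row_cuts, i)  # collapsed row that original bin i falls into
        for j, (g_val, h_val) in enumerate(zip(g_row, h_row)):
            c = bisect_right(col_cuts, j)  # collapsed column for original bin j
            g[r][c] += g_val
            h[r][c] += h_val
    return g, h


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed, mirroring a deterministic RNG
    grad = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    hess = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
    cuts = (choose_random_cuts(rng, 2, 2), choose_random_cuts(rng, 3, 2))
    print(cuts, collapse(grad, hess, cuts))
```

Note that the cut positions depend only on the bin counts, the leaf limits, and the random generator, never on the gradient or hessian values, which is the property that matters for differential privacy.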
The module handles both hessian-aware (Newton) updates and gradient-only updates based on the TermBoostFlags.
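As a rough illustration of how a per-cell update might combine these options, the sketch below uses the common gradient-boosting form: L1 soft-thresholding of the gradient sum, an L2 term added to the denominator, and a symmetric clip to the maximum step. The function name `cell_update`, the use of a weight sum as the gradient-only denominator, and the exact formula are assumptions made for illustration; the source only states that alpha, lambda, Newton or gradient-only updates, and a step size constraint are involved.

```python
import math


def cell_update(grad_sum, hess_sum, weight_sum, *, newton=True,
                reg_alpha=0.0, reg_lambda=0.0, delta_step_max=math.inf):
    """Hypothetical per-cell boosting update (not the libebm formula)."""
    # Newton updates divide by the hessian sum; gradient-only updates
    # are assumed here to divide by the sample weight sum instead.
    denom = (hess_sum if newton else weight_sum) + reg_lambda
    if denom <= 0.0:
        return 0.0  # empty or degenerate cell: leave the score unchanged
    # L1 regularization: shrink the gradient magnitude toward zero by alpha.
    shrunk = math.copysign(max(abs(grad_sum) - reg_alpha, 0.0), grad_sum)
    step = -shrunk / denom
    # Step size constraint: clip the update to [-delta_step_max, +delta_step_max].
    return max(-delta_step_max, min(delta_step_max, step))


if __name__ == "__main__":
    print(cell_update(4.0, 2.0, 8.0))                # Newton step
    print(cell_update(4.0, 2.0, 8.0, newton=False))  # gradient-only step
```

Under these assumptions, disabling the Newton update changes only the denominator, so the regularization and clipping logic is shared by both paths.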
Usage
This module is called during boosting when random splitting is selected, which is primarily for differential privacy scenarios. It provides the split selection mechanism that avoids data-dependent split point optimization.
Code Reference
Source Location
- Repository: Interpretml_Interpret
- File: shared/libebm/PartitionRandomBoosting.cpp
Signature
template<bool bHessian, size_t cCompilerScores>
class PartitionRandomBoostingInternal final {
 public:
   static ErrorEbm Func(
         RandomDeterministic* const pRng,
         BoosterShell* const pBoosterShell,
         const Term* const pTerm,
         const TermBoostFlags flags,
         const FloatCalc regAlpha,
         const FloatCalc regLambda,
         const FloatCalc deltaStepMax,
         const IntEbm* const aLeavesMax,
         const MonotoneDirection monotoneDirection,
         double* const pTotalGain);
};
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| pRng | RandomDeterministic* | Yes | Deterministic random number generator |
| pBoosterShell | BoosterShell* | Yes | Boosting context containing bins and model state |
| pTerm | const Term* | Yes | The interaction term being boosted |
| flags | TermBoostFlags | Yes | Boosting flags (DisableNewtonGain, DisableNewtonUpdate) |
| regAlpha | FloatCalc | Yes | L1 regularization parameter |
| regLambda | FloatCalc | Yes | L2 regularization parameter |
| deltaStepMax | FloatCalc | Yes | Maximum update step size |
| aLeavesMax | const IntEbm* | Yes | Maximum leaves per dimension |
| monotoneDirection | MonotoneDirection | Yes | Monotone constraint direction (NONE for interactions) |
Outputs
| Name | Type | Description |
|---|---|---|
| return value | ErrorEbm | Error code (Error_None on success) |
| pTotalGain | double* | Total gain from the random split |
| pBoosterShell (internal) | BoosterShell* | Updated inner term update tensor |
Usage Examples
Pipeline Context
# This C++ module is called internally via the native bindings,
# primarily when training with differential privacy.
from interpret.privacy import DPExplainableBoostingClassifier

# X, y: training features and labels, assumed to be loaded already
ebm = DPExplainableBoostingClassifier(epsilon=1.0)
ebm.fit(X, y)  # Internally calls PartitionRandomBoosting for DP-safe splits