Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Interpretml Interpret PartitionMultiDimensionalTree

From Leeroopedia


Knowledge Sources
Domains Machine_Learning, EBM_Core
Last Updated 2026-02-07 12:00 GMT

Overview

PartitionMultiDimensionalTree is a C++ module that builds a multi-dimensional decision tree over interaction terms during EBM boosting, enabling recursive tree-based splits across multiple dimensions.

Description

This module implements a tree-based partitioning algorithm for multi-dimensional interaction terms. Unlike the corner-based or straight-based algorithms, this approach builds an actual decision tree over the multi-dimensional bin space using TreeNodeMulti nodes, allowing splits on any dimension at each tree node.

Key components include:

  • MakeTensor: Traverses the completed tree to construct the boosting update tensor. It walks the tree nodes, collects split points per dimension, and builds the full tensor with predictions for each leaf region. It supports optional purification via PurifyInternal to decompose the multi-dimensional tensor update into interpretable components. When purification is enabled, it also outputs tensor weights, gradients, and hessians for the purification algorithm.
  • PartitionMultiDimensionalTreeInternal::Func: The main tree-building function that performs a best-first (priority-queue-based) greedy tree construction:
    • Starts with a root node containing all bins
    • Uses TensorTotalsSum to evaluate potential splits along each dimension
    • Pushes promising splits into a priority queue ordered by gain
    • Pops the best split and creates child nodes
    • Continues until the maximum number of leaves is reached or no beneficial splits remain
    • Respects regularization parameters (alpha, lambda), minimum leaf size, minimum hessian, and maximum step size constraints

The tree nodes use TreeNodeMulti which stores both the split dimension index and split position, allowing non-uniform splits across dimensions.

Usage

This module is called during boosting when processing multi-dimensional interaction terms with the tree-based splitting strategy. It is the most flexible multi-dimensional partitioner, suitable for capturing complex interaction patterns.

Code Reference

Source Location

Signature

template<bool bHessian, size_t cCompilerScores, size_t cCompilerDimensions>
static ErrorEbm MakeTensor(
    const size_t cRuntimeScores,
    const size_t cRuntimeRealDimensions,
    const TermBoostFlags flags,
    const Bin<FloatMain, UIntMain, true, true, bHessian, GetArrayScores(cCompilerScores)>* const aBins,
    const FloatCalc regAlpha,
    const FloatCalc regLambda,
    const FloatCalc deltaStepMax,
    double* const aTensorWeights,
    double* const aTensorGrad,
    double* const aTensorHess,
    const size_t cPossibleSplits,
    unsigned char** const aaSplits,
    TreeNodeMulti<bHessian, GetArrayScores(cCompilerScores)>* const pRootTreeNode,
    const size_t* const aiOriginalIndex,
    TensorSumDimension* const aDimensions,
    Bin<...>* const aAuxiliaryBins,
    Tensor* const pInnerTermUpdate
    ...
);

template<bool bHessian, size_t cCompilerScores>
class PartitionMultiDimensionalTreeInternal final {
public:
    static ErrorEbm Func(
        RandomDeterministic* const pRng,
        const size_t cRuntimeScores,
        const size_t cRealDimensions,
        const TermBoostFlags flags,
        const IntEbm* const aLeavesMax,
        const size_t cSamplesLeafMin,
        const FloatCalc hessianMin,
        const FloatCalc regAlpha,
        const FloatCalc regLambda,
        const FloatCalc deltaStepMax,
        const BinBase* const aBinsBase,
        BinBase* const aAuxiliaryBinsBase,
        Tensor* const pInnerTermUpdate,
        const size_t* const acBins,
        double* const aTensorWeights,
        double* const aTensorGrad,
        double* const aTensorHess,
        double* const pTotalGain
        ...
    );
};

I/O Contract

Inputs

Name Type Required Description
pRng RandomDeterministic* No Random number generator for purification
cRuntimeScores size_t Yes Number of score outputs
cRealDimensions size_t Yes Number of dimensions with more than 1 bin
flags TermBoostFlags Yes Boosting flags (DisableNewtonGain, DisableNewtonUpdate)
aLeavesMax const IntEbm* Yes Maximum number of leaves per dimension
cSamplesLeafMin size_t Yes Minimum samples per leaf
hessianMin FloatCalc Yes Minimum hessian sum per leaf
regAlpha FloatCalc Yes L1 regularization parameter
regLambda FloatCalc Yes L2 regularization parameter
deltaStepMax FloatCalc Yes Maximum update step size
aBinsBase const BinBase* Yes Pre-computed histogram bins
acBins const size_t* Yes Bin counts per dimension

Outputs

Name Type Description
return value ErrorEbm Error code (Error_None on success)
pInnerTermUpdate Tensor* Updated tensor containing tree-based split scores
aTensorWeights double* Tensor weights (for purification)
aTensorGrad double* Tensor gradients (for purification)
aTensorHess double* Tensor hessians (for purification)
pTotalGain double* Total gain achieved by the tree splits

Usage Examples

Pipeline Context

# This C++ module is called internally via the native bindings
# during boosting of multi-dimensional interaction terms
from interpret.glassbox import ExplainableBoostingClassifier
ebm = ExplainableBoostingClassifier(interactions=10)
ebm.fit(X, y)  # Internally calls PartitionMultiDimensionalTree for pair boosting

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment