Implementation:Interpretml Interpret TensorTotalsSum
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, EBM_Core |
| Last Updated | 2026-02-07 12:00 GMT |
Overview
TensorTotalsSum is a C++ header module that computes the sum of histogram bin statistics over an arbitrary sub-region of a multi-dimensional prefix-sum tensor using inclusion-exclusion.
Description
This header provides the TensorTotalsSum function (and its specializations) which, given a prefix-sum tensor built by TensorTotalsBuild, efficiently computes the total sample count, weight, gradient sum, and hessian sum for any rectangular sub-region specified by low and high bounds along each dimension.
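As a minimal one-dimensional sketch of the idea (an illustration, not the library's code), the snippet below keeps several statistics per bin and answers a range query from running totals. The names `BinStats`, `prefix_1d`, and `range_total` are hypothetical, and the sketch assumes the prefix tensor stores inclusive running totals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinStats:
    """Hypothetical stand-in for the per-bin statistics held by the C++ Bin."""
    count: int = 0
    weight: float = 0.0
    grad: float = 0.0
    hess: float = 0.0

    def __add__(self, o):
        return BinStats(self.count + o.count, self.weight + o.weight,
                        self.grad + o.grad, self.hess + o.hess)

    def __sub__(self, o):
        return BinStats(self.count - o.count, self.weight - o.weight,
                        self.grad - o.grad, self.hess - o.hess)


def prefix_1d(bins):
    """Return inclusive running totals: out[i] = bins[0] + ... + bins[i]."""
    out, running = [], BinStats()
    for b in bins:
        running = running + b
        out.append(running)
    return out


def range_total(prefix, low, high):
    """Total over bins [low, high): low is inclusive, high is exclusive."""
    total = prefix[high - 1]
    if low > 0:  # nothing to subtract when the region starts at bin 0
        total = total - prefix[low - 1]
    return total


bins = [
    BinStats(2, 2.0, 0.5, 1.0),
    BinStats(1, 1.0, -0.25, 0.5),
    BinStats(3, 3.0, 1.5, 0.75),
    BinStats(4, 4.0, -1.0, 2.0),
]
prefix = prefix_1d(bins)
print(range_total(prefix, 1, 3))  # statistics of bins 1 and 2 combined
```

All four statistics are obtained with at most two lookups, whatever the width of the queried range.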
Key components include:
- TensorSumDimension struct: Defines the query region for each dimension with fields m_iLow (inclusive lower bound), m_iHigh (exclusive upper bound), and m_cBins (total bin count).
- TensorTotalsSumMulti: The general N-dimensional implementation. It uses inclusion-exclusion over 2^D corners, iterating through all bit patterns to alternate between addition and subtraction of corner values. Dimensions where m_iLow == 0 are optimized away since they don't require subtraction, reducing the effective dimensionality.
- TensorTotalsSumPair: Specialization for 2-dimensional queries (currently delegates to Multi, with a TODO for direct 4-bin lookup optimization).
- TensorTotalsSumTripple: Specialization for 3-dimensional queries (currently delegates to Multi).
- TensorTotalsSum: Dispatch function that routes to the appropriate specialization based on compile-time or runtime dimension count.
- Debug verification functions (TensorTotalsSumDebugSlow, TensorTotalsCompareDebug): Brute-force implementations that iterate over all bins in the region for validation in debug builds.
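The inclusion-exclusion scheme and its brute-force check can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a port of the C++ code: tensors are plain dicts keyed by index tuples, each cell holds a single number instead of a full bin, the prefix tensor is assumed to hold inclusive sums, and `build_prefix`, `region_sum`, and `region_sum_slow` are hypothetical names.

```python
from itertools import product


def build_prefix(values, shape):
    """Inclusive prefix sums along every axis (stand-in for TensorTotalsBuild)."""
    prefix = dict(values)
    for axis in range(len(shape)):
        # Lexicographic order guarantees the predecessor is already accumulated.
        for idx in product(*(range(n) for n in shape)):
            if idx[axis] > 0:
                prev = idx[:axis] + (idx[axis] - 1,) + idx[axis + 1:]
                prefix[idx] += prefix[prev]
    return prefix


def region_sum(prefix, dims):
    """dims holds (low, high, bins) per dimension; low inclusive, high exclusive."""
    # Dimensions with low == 0 never contribute a subtracted corner.
    active = [d for d, (low, _high, _bins) in enumerate(dims) if low > 0]
    total = 0
    for bits in product((0, 1), repeat=len(active)):
        corner = [high - 1 for (_low, high, _bins) in dims]
        for d, bit in zip(active, bits):
            if bit:
                corner[d] = dims[d][0] - 1  # step just below the lower bound
        sign = -1 if sum(bits) % 2 else 1   # odd number of low corners: subtract
        total += sign * prefix[tuple(corner)]
    return total


def region_sum_slow(values, dims):
    """Brute-force reference: visit every cell inside the region."""
    ranges = (range(low, high) for low, high, _bins in dims)
    return sum(values[idx] for idx in product(*ranges))


shape = (3, 4, 2)
values = {idx: 1 + idx[0] * 8 + idx[1] * 2 + idx[2]
          for idx in product(*(range(n) for n in shape))}
prefix = build_prefix(values, shape)

dims = [(1, 3, 3), (0, 2, 4), (1, 2, 2)]  # only dimensions 0 and 2 are active
assert region_sum(prefix, dims) == region_sum_slow(values, dims)
```

In this query only two of the three dimensions have a non-zero lower bound, so the fast path reads four corners instead of eight.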
Each query costs O(2^D) tensor lookups regardless of how many bins the sub-region covers, where D is the number of dimensions that have non-zero lower bounds.
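To make the cost concrete, the following sketch counts the corner lookups for a query and compares them with the number of cells a brute-force scan would touch. The helper names `corner_lookups` and `brute_force_cells` are hypothetical, and the bin counts are made-up illustration values.

```python
from math import prod


def corner_lookups(dims):
    """Corners read by inclusion-exclusion: 2 ** (dimensions with low > 0)."""
    active = sum(1 for low, _high, _bins in dims if low > 0)
    return 2 ** active


def brute_force_cells(dims):
    """Cells visited when summing every bin in the region directly."""
    return prod(high - low for low, high, _bins in dims)


# (low, high, bins) per dimension; only the middle dimension has low > 0.
dims = [(0, 200, 256), (5, 250, 256), (0, 128, 256)]
print(corner_lookups(dims))     # 2
print(brute_force_cells(dims))  # 6272000
```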
Usage
This module is called during split finding in both boosting and interaction detection. Every candidate split evaluation requires computing bin statistics for the resulting sub-regions, and TensorTotalsSum provides the efficient mechanism for these queries.
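As a rough illustration of the kind of query a pair split generates, the sketch below cuts a 2-dimensional tensor once along each dimension and reads the totals of the four resulting quadrants from a prefix-sum grid. It is a simplified model, not the library's split-finding code: each cell holds one number (for example a gradient sum), and `prefix_2d`, `rect_sum`, and `quadrant_sums` are hypothetical names.

```python
def prefix_2d(grid):
    """Inclusive 2-D prefix sums: p[r][c] = sum of grid[0..r][0..c]."""
    rows, cols = len(grid), len(grid[0])
    p = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        running = 0  # running total of the current row
        for c in range(cols):
            running += grid[r][c]
            p[r][c] = running + (p[r - 1][c] if r > 0 else 0)
    return p


def rect_sum(p, r_lo, r_hi, c_lo, c_hi):
    """Sum over rows [r_lo, r_hi) and columns [c_lo, c_hi) using at most 4 lookups."""
    total = p[r_hi - 1][c_hi - 1]
    if r_lo > 0:
        total -= p[r_lo - 1][c_hi - 1]
    if c_lo > 0:
        total -= p[r_hi - 1][c_lo - 1]
    if r_lo > 0 and c_lo > 0:
        total += p[r_lo - 1][c_lo - 1]  # added back: it was subtracted twice
    return total


def quadrant_sums(p, r_cut, c_cut):
    """Totals of the four sub-regions produced by one cut per dimension."""
    rows, cols = len(p), len(p[0])
    return [
        rect_sum(p, 0, r_cut, 0, c_cut),
        rect_sum(p, 0, r_cut, c_cut, cols),
        rect_sum(p, r_cut, rows, 0, c_cut),
        rect_sum(p, r_cut, rows, c_cut, cols),
    ]


grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
p = prefix_2d(grid)
print(quadrant_sums(p, 1, 2))  # [3, 3, 24, 15]
```

The prefix grid is built once and reused for every candidate pair of cuts, so each candidate costs a constant number of lookups rather than a scan over the bins.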
Code Reference
Source Location
- Repository: Interpretml_Interpret
- File: shared/libebm/TensorTotalsSum.hpp
Signature
struct TensorSumDimension {
size_t m_iLow;
size_t m_iHigh;
size_t m_cBins;
};
template<bool bHessian, size_t cCompilerScores, size_t cCompilerDimensions>
static void TensorTotalsSum(
const size_t cRuntimeScores,
const size_t cRuntimeDimensions,
const TensorSumDimension* const aDimensions,
const Bin<FloatMain, UIntMain, true, true, bHessian, GetArrayScores(cCompilerScores)>* const aBins,
Bin<FloatMain, UIntMain, true, true, bHessian, GetArrayScores(cCompilerScores)>& binOut,
GradientPair<FloatMain, bHessian>* const aGradientPairsOut
...
);
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| cRuntimeScores | size_t | Yes | Number of score outputs |
| cRuntimeDimensions | size_t | Yes | Number of dimensions |
| aDimensions | const TensorSumDimension* | Yes | Array defining the query region per dimension (low, high, cBins) |
| aBins | const Bin* | Yes | Prefix-sum tensor (built by TensorTotalsBuild) |
Outputs
| Name | Type | Description |
|---|---|---|
| binOut | Bin& | Output bin containing summed count, weight, and gradient statistics |
| aGradientPairsOut | GradientPair* | Output gradient pairs for the queried region |
Usage Examples
Pipeline Context
# This C++ module is not called directly from Python; the native bindings
# invoke it during split evaluation to compute sub-region statistics.
from interpret.glassbox import ExplainableBoostingClassifier

# X and y are assumed to be an already-loaded feature matrix and label vector.
ebm = ExplainableBoostingClassifier(interactions=10)
ebm.fit(X, y)  # Internally calls TensorTotalsSum for every candidate split