Implementation:Interpretml Interpret ConvertAddBin
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, EBM_Core |
| Last Updated | 2026-02-07 12:00 GMT |
Overview
ConvertAddBin is a C++ module that converts and accumulates histogram bin data between numeric representations (float/double for floating-point fields, uint32_t/uint64_t for integer fields) during the EBM boosting process.
Description
This module implements a single function, ConvertAddBin, which converts histogram bin data from a source representation to a destination representation and adds it into the destination bins. It supports every combination of source and destination formats: float or double for floating-point data, and uint32_t or uint64_t for count data. It also handles the optional fields, including sample counts, weights, gradients, and hessians.
The function selects among C++ template specializations of the Bin class at runtime, based on boolean flags, to determine the memory layout offsets of the source and destination bins. It then uses byte-level pointer arithmetic to read each source bin and accumulate its values into the matching destination bin. This design lets a single non-templated function handle all type combinations without the code bloat of templating the function itself.
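The layout-selection idea described above can be sketched in a much reduced form. The names below (`SimpleBin`, `Layout`, `SelectLayout`, `ConvertAddSimple`) are hypothetical and not part of the library; the sketch assumes a bin with only one sample count and one gradient sum, and it uses `std::memcpy` for the byte-level reads and writes, which is a choice made here for portability rather than a statement about how the real source accesses memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified stand-in for the library's Bin template:
// one sample count and one gradient sum per bin.
template<typename TFloat, typename TUInt> struct SimpleBin {
   TUInt m_cSamples;
   TFloat m_sumGradients;
};

// Byte offsets and stride of one concrete SimpleBin instantiation.
struct Layout {
   std::size_t iSamples;
   std::size_t iGradient;
   std::size_t cBytesPerBin;
};

template<typename TFloat, typename TUInt> Layout MakeLayout() {
   typedef SimpleBin<TFloat, TUInt> B; // typedef keeps the comma out of the offsetof macro
   return Layout{offsetof(B, m_cSamples), offsetof(B, m_sumGradients), sizeof(B)};
}

// Map the runtime flags onto one of the four compile-time layouts.
inline Layout SelectLayout(bool bUInt64, bool bDouble) {
   if(bUInt64) {
      return bDouble ? MakeLayout<double, std::uint64_t>() : MakeLayout<float, std::uint64_t>();
   }
   return bDouble ? MakeLayout<double, std::uint32_t>() : MakeLayout<float, std::uint32_t>();
}

// Read an integer field of runtime-selected width, widened to 64 bits.
inline std::uint64_t LoadUInt(const unsigned char* p, bool bUInt64) {
   if(bUInt64) { std::uint64_t v; std::memcpy(&v, p, sizeof(v)); return v; }
   std::uint32_t v; std::memcpy(&v, p, sizeof(v)); return v;
}

// Read a floating-point field of runtime-selected width, widened to double.
inline double LoadFloat(const unsigned char* p, bool bDouble) {
   if(bDouble) { double v; std::memcpy(&v, p, sizeof(v)); return v; }
   float v; std::memcpy(&v, p, sizeof(v)); return v;
}

// Add to an integer field, narrowing only if the destination is 32-bit.
inline void AddUInt(unsigned char* p, bool bUInt64, std::uint64_t add) {
   if(bUInt64) {
      std::uint64_t v; std::memcpy(&v, p, sizeof(v)); v += add; std::memcpy(p, &v, sizeof(v));
   } else {
      std::uint32_t v; std::memcpy(&v, p, sizeof(v));
      v += static_cast<std::uint32_t>(add); std::memcpy(p, &v, sizeof(v));
   }
}

// Add to a floating-point field, narrowing only if the destination is float.
inline void AddFloat(unsigned char* p, bool bDouble, double add) {
   if(bDouble) {
      double v; std::memcpy(&v, p, sizeof(v)); v += add; std::memcpy(p, &v, sizeof(v));
   } else {
      float v; std::memcpy(&v, p, sizeof(v));
      v += static_cast<float>(add); std::memcpy(p, &v, sizeof(v));
   }
}

// Non-templated loop: walks both arrays by their own stride and accumulates.
inline void ConvertAddSimple(std::size_t cBins, bool bUInt64Src, bool bDoubleSrc, const void* aSrc,
      bool bUInt64Dest, bool bDoubleDest, void* aAddDest) {
   assert(nullptr != aSrc && nullptr != aAddDest);
   const Layout src = SelectLayout(bUInt64Src, bDoubleSrc);
   const Layout dest = SelectLayout(bUInt64Dest, bDoubleDest);
   const unsigned char* pSrc = static_cast<const unsigned char*>(aSrc);
   unsigned char* pDest = static_cast<unsigned char*>(aAddDest);
   for(std::size_t iBin = 0; iBin < cBins; ++iBin) {
      AddUInt(pDest + dest.iSamples, bUInt64Dest, LoadUInt(pSrc + src.iSamples, bUInt64Src));
      AddFloat(pDest + dest.iGradient, bDoubleDest, LoadFloat(pSrc + src.iGradient, bDoubleSrc));
      pSrc += src.cBytesPerBin;
      pDest += dest.cBytesPerBin;
   }
}
```

Only `MakeLayout` is templated, so the accumulation loop exists once in the binary regardless of how many source and destination combinations are supported.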
Usage
This module is called during boosting when bin statistics computed on different data subsets (potentially using different numeric precisions, such as SIMD vs CPU subsets) need to be merged together. It serves as the final aggregation step where gradient and hessian sums from multiple sources are combined, and pre-computed sample counts and weights can optionally be applied.
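As a rough illustration of this aggregation step, the sketch below merges per-bin gradient sums from several subsets into one double-precision histogram. `SubsetHistogram` and `MergeSubsets` are hypothetical names, and the assumption that one subset stores float sums while another stores double sums is made only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for one subset's per-bin gradient sums.
// Exactly one of the two vectors is populated, chosen by bDouble.
struct SubsetHistogram {
   bool bDouble;
   std::vector<float> aSums32;
   std::vector<double> aSums64;
};

// Add every subset's partial sums into a double-precision main histogram.
inline void MergeSubsets(const std::vector<SubsetHistogram>& subsets, std::vector<double>& aMain) {
   for(const SubsetHistogram& subset : subsets) {
      const std::size_t cBins = subset.bDouble ? subset.aSums64.size() : subset.aSums32.size();
      assert(cBins == aMain.size()); // every subset must cover the same bins
      for(std::size_t iBin = 0; iBin < cBins; ++iBin) {
         // float values are widened to double before the addition
         aMain[iBin] += subset.bDouble ? subset.aSums64[iBin] : static_cast<double>(subset.aSums32[iBin]);
      }
   }
}
```

Widening before adding means the lower-precision subset limits only its own partial sums, not the precision of the merged total.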
Code Reference
Source Location
- Repository: Interpretml_Interpret
- File: shared/libebm/ConvertAddBin.cpp
Signature
extern void ConvertAddBin(
const size_t cScores,
const bool bHessian,
const size_t cBins,
const bool bUInt64Src,
const bool bDoubleSrc,
const bool bCountSrc,
const bool bWeightSrc,
const void* const aSrc,
const UIntMain* const aCounts,
const FloatPrecomp* const aWeights,
const bool bUInt64Dest,
const bool bDoubleDest,
void* const aAddDest);
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| cScores | size_t | Yes | Number of scores (classes) per bin |
| bHessian | bool | Yes | Whether hessian data is included in the bins |
| cBins | size_t | Yes | Number of bins to process |
| bUInt64Src | bool | Yes | Whether source uses uint64_t for integer fields |
| bDoubleSrc | bool | Yes | Whether source uses double for float fields |
| bCountSrc | bool | Yes | Whether source bins contain sample count fields |
| bWeightSrc | bool | Yes | Whether source bins contain weight fields |
| aSrc | const void* | Yes | Pointer to source bin array |
| aCounts | const UIntMain* | No | Pre-computed counts to assign (overrides source counts) |
| aWeights | const FloatPrecomp* | No | Pre-computed weights to assign (overrides source weights) |
| bUInt64Dest | bool | Yes | Whether destination uses uint64_t for integer fields |
| bDoubleDest | bool | Yes | Whether destination uses double for float fields |
| aAddDest | void* | Yes | Pointer to destination bin array (values are added in-place) |
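The interaction between the optional aCounts pointer and the bCountSrc flag can be sketched as a precedence rule. This is an assumption that reads the table literally (a non-null aCounts is assigned and takes priority over source counts); `ApplyCounts` and its parameters are hypothetical, and the real function may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified model of the count handling (an assumption, not the library's code):
// - aCounts non-null: its value is assigned to the destination count
// - otherwise, if the source carries counts: the source count is added
// - otherwise: the destination count is left untouched
inline void ApplyCounts(std::size_t cBins, const std::uint64_t* aCounts, bool bCountSrc,
      const std::uint64_t* aSrcCounts, std::uint64_t* aDestCounts) {
   assert(nullptr != aDestCounts);
   for(std::size_t iBin = 0; iBin < cBins; ++iBin) {
      if(nullptr != aCounts) {
         aDestCounts[iBin] = aCounts[iBin];
      } else if(bCountSrc) {
         assert(nullptr != aSrcCounts);
         aDestCounts[iBin] += aSrcCounts[iBin];
      }
   }
}
```

Under the same assumption, aWeights would follow the identical pattern with floating-point values and the bWeightSrc flag.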
Outputs
| Name | Type | Description |
|---|---|---|
| aAddDest | void* (in-place) | Destination bin array with accumulated values from source bins |
Usage Examples
Pipeline Context
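# This C++ module is called internally via the native bindings
# during the boosting phase when merging bin statistics from
# multiple data subsets (e.g., SIMD and CPU subsets)
from sklearn.datasets import make_classification
from interpret.glassbox import ExplainableBoostingClassifier

# Small synthetic dataset so the example runs end to end
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)  # Internally calls ConvertAddBin during boosting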