Implementation:Interpretml Interpret Gradient Histogram Bin

From Leeroopedia


Knowledge Sources
Domains: Machine_Learning, EBM_Core
Last Updated: 2026-02-07 12:00 GMT

Overview

Defines the templated Bin data structure used to accumulate gradient and hessian statistics during the EBM boosting and interaction detection processes.

Description

The Bin.hpp header provides the core histogram bin data structures for the EBM native library. It defines a hierarchy of three types: BinBase, the non-templated base; BinData, which holds the actual storage and has template specializations for different combinations of sample count, weight, and hessian tracking; and Bin, the fully parameterized type used throughout the codebase.

Bins are POD (Plain Old Data) types whose memory is managed only with malloc/free. They use the struct hack pattern (via ArrayToPointer) so that the gradient pairs are stored in the same allocation as the rest of the bin. Each bin accumulates the per-feature statistics needed for tree-based split finding: sample counts, weights, and gradient/hessian pairs.
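The layout described above can be illustrated with a minimal sketch. Every name below (SketchBin, SketchGradientPair, SketchBinSize, SketchIndexBin, SketchGetGradientPairs, SketchAllocateBins) is hypothetical and chosen for illustration; none of it is taken from Bin.hpp, and the real types are templated on the parameters listed under I/O Contract. The sketch assumes a bin that always tracks counts, weights, and hessians.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical, simplified stand-ins. These are NOT the definitions in Bin.hpp.
struct SketchGradientPair {
   double sumGradients;
   double sumHessians;
};

struct SketchBin {
   uint64_t cSamples;
   double weight;
   // Declared with one element, but each bin is allocated with room for
   // cScores elements (the struct hack).
   SketchGradientPair aGradientPairs[1];
};

// Bytes occupied by one bin that carries cScores gradient pairs.
inline size_t SketchBinSize(const size_t cScores) {
   assert(1 <= cScores);
   return offsetof(SketchBin, aGradientPairs) + sizeof(SketchGradientPair) * cScores;
}

// The bin size is a runtime value, so bins are located by byte offset
// instead of ordinary array indexing.
inline SketchBin* SketchIndexBin(void* const aBins, const size_t cBytesPerBin, const size_t iBin) {
   return reinterpret_cast<SketchBin*>(static_cast<unsigned char*>(aBins) + cBytesPerBin * iBin);
}

// Decays the trailing array to a pointer so that pairs past index 0 can be reached.
inline SketchGradientPair* SketchGetGradientPairs(SketchBin* const pBin) {
   return pBin->aGradientPairs;
}

// Allocates cBins zeroed bins with malloc; the caller releases them with free.
inline void* SketchAllocateBins(const size_t cBins, const size_t cScores) {
   const size_t cBytes = SketchBinSize(cScores) * cBins;
   void* const p = std::malloc(cBytes);
   if(nullptr != p) {
      std::memset(p, 0, cBytes);
   }
   return p;
}
```

Because the types are POD, a single malloc plus memset is enough to create a zeroed histogram, and no constructors or destructors need to run. The cost is that indexing has to go through a helper such as SketchIndexBin, since sizeof(SketchBin) only describes a bin with one score.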

Usage

Used internally during boosting and interaction detection to accumulate per-bin gradient and hessian statistics. BinSumsBoosting and BinSumsInteraction populate these bins, and the partition algorithms (e.g., PartitionMultiDimensionalFull) read from them to compute gain and find optimal splits.
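The populate-then-read flow can be sketched for a single feature as follows. This is an illustration under stated assumptions, not the code of BinSumsBoosting or PartitionMultiDimensionalFull: the names SketchHistBin, SketchBinSums, and SketchBestSplit are hypothetical, std::vector replaces the malloc-based storage for brevity, gradients and hessians are assumed to be multiplied by the sample weight, and the gain uses the common second-order form G²/H summed over both sides minus the parent, which may differ from what the library computes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplified bin; the real Bin is templated and malloc-allocated.
struct SketchHistBin {
   size_t cSamples = 0;
   double weight = 0.0;
   double sumGradients = 0.0;
   double sumHessians = 0.0;
};

// Accumulation step: add each sample's statistics to the bin its feature value maps to.
inline std::vector<SketchHistBin> SketchBinSums(const size_t cBins,
      const std::vector<size_t>& aiBin,
      const std::vector<double>& aGradients,
      const std::vector<double>& aHessians,
      const std::vector<double>& aWeights) {
   std::vector<SketchHistBin> aBins(cBins);
   for(size_t i = 0; i < aiBin.size(); ++i) {
      assert(aiBin[i] < cBins);
      SketchHistBin& bin = aBins[aiBin[i]];
      bin.cSamples += 1;
      bin.weight += aWeights[i];
      // Assumption: statistics are weighted by the sample weight.
      bin.sumGradients += aWeights[i] * aGradients[i];
      bin.sumHessians += aWeights[i] * aHessians[i];
   }
   return aBins;
}

// Split search: scan the bins left to right and keep the cut with the highest gain.
// Returns iSplit such that bins [0, iSplit) go left; 0 means no useful split was found.
inline size_t SketchBestSplit(const std::vector<SketchHistBin>& aBins, double* const pGainOut) {
   double totalG = 0.0;
   double totalH = 0.0;
   for(const SketchHistBin& bin : aBins) {
      totalG += bin.sumGradients;
      totalH += bin.sumHessians;
   }
   const double parentScore = 0.0 < totalH ? totalG * totalG / totalH : 0.0;

   double bestGain = 0.0;
   size_t iBest = 0;
   double leftG = 0.0;
   double leftH = 0.0;
   for(size_t i = 0; i + 1 < aBins.size(); ++i) {
      leftG += aBins[i].sumGradients;
      leftH += aBins[i].sumHessians;
      const double rightG = totalG - leftG;
      const double rightH = totalH - leftH;
      if(leftH <= 0.0 || rightH <= 0.0) {
         continue; // one side is empty, so this cut is not a valid split
      }
      const double gain = leftG * leftG / leftH + rightG * rightG / rightH - parentScore;
      if(bestGain < gain) {
         bestGain = gain;
         iBest = i + 1;
      }
   }
   *pGainOut = bestGain;
   return iBest;
}
```

The split search never revisits individual samples. It only reads the per-bin sums, which is why the histogram has to carry gradient, hessian, count, and weight totals.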

Code Reference

Source Location

Signature

// Forward declaration of the fully parameterized bin type.
template<typename TFloat, typename TUInt, bool bCount, bool bWeight, bool bHessian, size_t cCompilerScores = 1>
struct Bin;

// Non-templated base of every bin.
struct BinBase {
   // Returns this bin as a pointer to the concrete Bin type.
   template<typename TFloat, typename TUInt, bool bCount, bool bWeight, bool bHessian, size_t cCompilerScores = 1>
   GPU_BOTH inline Bin<TFloat, TUInt, bCount, bWeight, bHessian, cCompilerScores>* Specialize();

   GPU_BOTH inline void ZeroMem(const size_t cBytesPerBin, const size_t cBins = 1, const size_t iBin = 0);
};

// Primary template; only the declaration is shown in this excerpt.
template<typename TFloat, typename TUInt, bool bCount, bool bWeight, bool bHessian, size_t cCompilerScores>
struct BinData;

// Specialization that tracks both the sample count and the weight.
template<typename TFloat, typename TUInt, bool bHessian, size_t cCompilerScores>
struct BinData<TFloat, TUInt, true, true, bHessian, cCompilerScores> : BinBase {
   GPU_BOTH inline TUInt GetCountSamples() const;
   GPU_BOTH inline void SetCountSamples(const TUInt cSamples);
   GPU_BOTH inline TFloat GetWeight() const;
   GPU_BOTH inline void SetWeight(const TFloat weight);
};

// Bin size helpers that take the tracking flags and score count as runtime values.
template<typename TFloat, typename TUInt>
static bool IsOverflowBinSize(const bool bCount, const bool bWeight, const bool bHessian, const size_t cScores);

template<typename TFloat, typename TUInt>
GPU_BOTH inline constexpr static size_t GetBinSize(
      const bool bCount, const bool bWeight, const bool bHessian, const size_t cScores);

I/O Contract

Template Parameter   Description
TFloat               Floating point type (float, double, or a SIMD pack)
TUInt                Unsigned integer type for sample counts
bCount               Whether to track sample counts per bin
bWeight              Whether to track weight per bin
bHessian             Whether to track hessians in addition to gradients
cCompilerScores      Number of scores (1 for binary, k for multiclass)

Output               Description
GradientPairs        Accumulated gradient (and optionally hessian) per score
CountSamples         Number of samples in bin (when bCount is true)
Weight               Accumulated weight (when bWeight is true)
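To make the effect of the tracking flags concrete, the sketch below computes a bin's byte size from the same four inputs that GetBinSize and IsOverflowBinSize accept. It is a hypothetical re-implementation for illustration only: the Sketch-prefixed names are invented here, and the layout is assumed to be a packed header (optional count, optional weight) followed by one or two TFloat values per score with no alignment padding, which the real header may not match.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helpers that mirror the roles of GetBinSize and IsOverflowBinSize.
// Assumption: fields are packed with no alignment padding.
template<typename TFloat, typename TUInt>
constexpr size_t SketchHeaderSize(const bool bCount, const bool bWeight) {
   return (bCount ? sizeof(TUInt) : size_t{0}) + (bWeight ? sizeof(TFloat) : size_t{0});
}

// One gradient per score, plus one hessian per score when bHessian is true.
template<typename TFloat>
constexpr size_t SketchBytesPerScore(const bool bHessian) {
   return (bHessian ? size_t{2} : size_t{1}) * sizeof(TFloat);
}

// True when header + perScore * cScores would not fit in size_t.
template<typename TFloat, typename TUInt>
constexpr bool SketchIsOverflowBinSize(
      const bool bCount, const bool bWeight, const bool bHessian, const size_t cScores) {
   return (std::numeric_limits<size_t>::max() - SketchHeaderSize<TFloat, TUInt>(bCount, bWeight)) /
         SketchBytesPerScore<TFloat>(bHessian) < cScores;
}

// Bytes per bin; call only after the overflow check has passed.
template<typename TFloat, typename TUInt>
constexpr size_t SketchGetBinSize(
      const bool bCount, const bool bWeight, const bool bHessian, const size_t cScores) {
   return SketchHeaderSize<TFloat, TUInt>(bCount, bWeight) + SketchBytesPerScore<TFloat>(bHessian) * cScores;
}

// Total bytes for a histogram of cBins bins.
inline size_t SketchHistogramBytes(const size_t cBins, const size_t cBytesPerBin) {
   assert(0 != cBytesPerBin);
   assert(cBins <= std::numeric_limits<size_t>::max() / cBytesPerBin);
   return cBins * cBytesPerBin;
}

// Tracking everything with one score: count + weight + gradient + hessian.
static_assert(SketchGetBinSize<double, uint64_t>(true, true, true, 1) == sizeof(uint64_t) + 3 * sizeof(double),
      "full bin with one score");
// Tracking only gradients for three scores.
static_assert(SketchGetBinSize<float, uint32_t>(false, false, false, 3) == 3 * sizeof(float),
      "gradient-only bin with three scores");
```

Checking for overflow before computing the size matters because the result feeds directly into malloc; a wrapped size would allocate too little memory for the bins that are later written.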

Usage Examples

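# Bins are not exposed in Python; the native library uses them during fit().
from sklearn.datasets import load_breast_cancer
from interpret.glassbox import ExplainableBoostingClassifier

X, y = load_breast_cancer(return_X_y=True)  # any tabular dataset works here
ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)  # Internally bins are used during boosting rounds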
