Implementation:Interpretml Interpret InterpretableNumerics
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, EBM_Core |
| Last Updated | 2026-02-07 12:00 GMT |
Overview
InterpretableNumerics is a C++ module that provides human-readable cut point generation and numerically safe mathematical operations for the EBM framework.
Description
This module implements two major categories of functionality:
Interpretable Cut Point Generation:
The core innovation is selecting cut points between adjacent feature values that are human-readable (e.g., "2.5" rather than "2.4999999999999996"). Key functions include:
- ArithmeticMean: Computes the arithmetic mean of two doubles using the overflow-safe formula low * 0.5 + high * 0.5, with careful handling of IEEE 754 edge cases.
- GeometricMeanPositives: Computes the geometric mean of two positive doubles using exp((log(low) + log(high)) * 0.5), falling back to the arithmetic mean if numerical issues arise.
- FloatToFullString: Converts a double to its full scientific notation string representation in a standardized format for cross-platform reproducibility.
- GetInterpretableCutPointFloat: The central function that finds a human-interpretable cut point between two adjacent values. It progressively shortens the string representation of the midpoint, choosing the shortest string that still falls between the two values.
- GetInterpretableEndpoint: Finds interpretable graph boundary endpoints by adjusting from the center point toward the edges.
- RemoveMissingValsAndReplaceInfinities: Preprocesses feature values by removing NaN and replacing infinities with min/max representable values.
- SuggestGraphBounds: EBM API function that suggests graph display boundaries for feature visualizations.
- GetHistogramCutCount: Determines the optimal number of histogram bins for a given feature.
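The midpoint formulas and the shortening idea behind GetInterpretableCutPointFloat can be illustrated with a minimal Python sketch. It is not the library's C++ implementation: the function names arithmetic_mean, geometric_mean_positives, and interpretable_cut are hypothetical, the fallback test in the geometric mean is an assumption, and the acceptance rule used here (the fewest significant digits whose rounded midpoint satisfies low < cut <= high) is an assumed simplification of the real string-based procedure.

```python
import math


def arithmetic_mean(low: float, high: float) -> float:
    # Halve each operand before adding so the result cannot overflow,
    # even when low + high would exceed the largest finite double.
    return low * 0.5 + high * 0.5


def geometric_mean_positives(low: float, high: float) -> float:
    # Average in log space; assumes 0 < low <= high.
    result = math.exp((math.log(low) + math.log(high)) * 0.5)
    # Assumed fallback rule: use the arithmetic mean if the result
    # is not finite or has drifted outside [low, high].
    if not math.isfinite(result) or result < low or high < result:
        result = arithmetic_mean(low, high)
    return result


def interpretable_cut(low: float, high: float) -> float:
    # Assumes finite inputs with low < high.
    mid = arithmetic_mean(low, high)
    # Try 1, 2, ..., 17 significant digits and keep the first
    # rounded midpoint that still separates the two values.
    for digits in range(1, 18):
        candidate = float(f"{mid:.{digits - 1}e}")
        if low < candidate <= high:
            return candidate
    return mid  # no shorter form fits; keep the exact midpoint
```

With this sketch, interpretable_cut(2.4, 2.6) returns 2.5 and interpretable_cut(0.1234, 0.1236) returns 0.1235: each is the shortest decimal that still lies between the two inputs.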
Numerically Safe Operations:
- SafeSum: Computes the sum of an array with overflow protection.
- SafeMean: Computes the mean with safe summation.
- SafeStandardDeviation: Computes standard deviation with safe summation using the two-pass algorithm.
- SafeExp: Applies the exponential function with clamping to prevent overflow.
- SafeLog: Applies the logarithm with clamping to prevent negative infinity.
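The general techniques named above (clamping and two-pass statistics) can be sketched in Python as follows. This is an illustration under stated assumptions, not the behavior of the native functions: the names safe_exp, safe_log, safe_mean, and safe_stddev are hypothetical, the clamp limits (the largest finite double and the smallest positive normal double) are assumed, dividing each term by the count is one possible way to avoid overflow, and the standard deviation is computed as the population form.

```python
import math
import sys

FLOAT_MAX = sys.float_info.max   # largest finite double
FLOAT_TINY = sys.float_info.min  # smallest positive normal double


def safe_exp(x: float) -> float:
    # Clamp the result to the largest finite double (assumed limit).
    try:
        result = math.exp(x)
    except OverflowError:
        return FLOAT_MAX
    return min(result, FLOAT_MAX)


def safe_log(x: float) -> float:
    # Negative inputs stay a domain error, reported here as NaN.
    if x < 0.0:
        return math.nan
    # Clamp the argument so log(0) never yields negative infinity.
    return math.log(max(x, FLOAT_TINY))


def safe_mean(vals: list) -> float:
    # Scale each term by 1/n before adding so the running total
    # stays within the range of the inputs and cannot overflow.
    n = len(vals)
    return sum(v / n for v in vals)


def safe_stddev(vals: list) -> float:
    # Two-pass algorithm: first pass computes the mean, second pass
    # averages the squared deviations from it (population form).
    n = len(vals)
    mean = safe_mean(vals)
    variance = sum((v - mean) ** 2 / n for v in vals)
    return math.sqrt(variance)
```

The two-pass form avoids the cancellation that affects the one-pass "mean of squares minus square of mean" formula when the values are large relative to their spread.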
Usage
The interpretable cut point functions are called during feature discretization to produce human-readable bin boundaries. The safe math functions are available through the public API for use by higher-level Python/R code when processing model outputs.
Code Reference
Source Location
- Repository: Interpretml_Interpret
- File: shared/libebm/interpretable_numerics.cpp
Signature
// Interpretable cut points
extern double ArithmeticMean(const double low, const double high) noexcept;
extern double GetInterpretableCutPointFloat(
double low, double high) noexcept;
extern double GetInterpretableEndpoint(
const double center, const double movementFromEnds) noexcept;
extern size_t RemoveMissingValsAndReplaceInfinities(
const size_t cSamples, double* const aVals) noexcept;
// Float/String conversion
extern IntEbm GetCountCharactersPerFloat();
extern ErrorEbm FloatsToString(IntEbm count, const double* vals, char* str);
extern ErrorEbm StringToFloats(const char* str, double* vals);
// Public API functions
EBM_API_BODY ErrorEbm EBM_CALLING_CONVENTION SuggestGraphBounds(
IntEbm countCuts, const double* cuts,
double minFeatureVal, double maxFeatureVal,
double* lowGraphBoundOut, double* highGraphBoundOut);
EBM_API_BODY ErrorEbm EBM_CALLING_CONVENTION SafeSum(
IntEbm count, const double* vals, double* sumOut);
EBM_API_BODY ErrorEbm EBM_CALLING_CONVENTION SafeMean(
IntEbm count, const double* vals, double* meanOut);
EBM_API_BODY ErrorEbm EBM_CALLING_CONVENTION SafeStandardDeviation(
IntEbm count, const double* vals, double* stddevOut);
EBM_API_BODY void EBM_CALLING_CONVENTION SafeExp(
IntEbm count, double* inout);
EBM_API_BODY void EBM_CALLING_CONVENTION SafeLog(
IntEbm count, double* inout);
EBM_API_BODY IntEbm EBM_CALLING_CONVENTION GetHistogramCutCount(
IntEbm countSamples, const double* featureVals);
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| low | double | Yes | Lower bound value for cut point calculation |
| high | double | Yes | Upper bound value for cut point calculation |
| count | IntEbm | Yes | Number of values for safe math operations |
| vals | const double* | Yes | Array of values for safe math operations |
| countCuts | IntEbm | Yes | Number of cut points for graph bounds |
| cuts | const double* | Yes | Cut point array for graph bounds |
Outputs
| Name | Type | Description |
|---|---|---|
| return (ArithmeticMean) | double | Midpoint between low and high |
| return (GetInterpretableCutPointFloat) | double | Human-readable cut point between low and high |
| sumOut / meanOut / stddevOut | double* | Safe numerical computation results |
| lowGraphBoundOut / highGraphBoundOut | double* | Suggested graph display boundaries |
Usage Examples
Pipeline Context
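# This C++ module is called internally via the native bindings
# during discretization to produce human-readable cut points
from sklearn.datasets import load_breast_cancer
from interpret.glassbox import ExplainableBoostingClassifier

# Any tabular dataset works; this one is used only as a placeholder
X, y = load_breast_cancer(return_X_y=True)
ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)  # Internally calls GetInterpretableCutPointFloat for bin edges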