Principle:SeldonIO Seldon core Explainer Model Training

Field	Value
Overview	Training model-agnostic explanation models that provide interpretable predictions for black-box classifiers.
Domains	Explainability, MLOps
Workflow	Model_Explainability
Related Implementation	SeldonIO_Seldon_core_Alibi_Explainer_Training
Last Updated	2026-02-13 00:00 GMT

Description

Alibi Explain provides several explainability algorithms: AnchorTabular for tabular data (finds a minimal set of feature conditions under which the prediction holds with high probability), AnchorText for text data (finds a minimal set of words with the same property), and KernelShap for feature importance scores. Each explainer wraps a predictor function. AnchorTabular and KernelShap are additionally fitted on training data to learn the data distribution, while AnchorText relies on a spaCy NLP model instead.

The training process involves:

AnchorTabular: Learns discretization bins from continuous features and maps categorical features to their category names. The explainer is fitted on training data with configurable percentile boundaries for discretization.
AnchorText: Wraps a text classifier's predict function and uses a spaCy NLP model for word-level perturbation. The sampling strategy (e.g., unknown) determines how replacement words are chosen.
KernelShap: Wraps a model's decision function and fits on background training data used to compute marginal expectations for missing features.
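
The discretization step described for AnchorTabular can be illustrated with a small standard-library sketch. It is not Alibi's implementation: the helper names `fit_bins` and `discretize`, the sample `ages` data, and the choice of quartile boundaries are all assumptions made for illustration.

```python
from bisect import bisect_left
from statistics import quantiles


def fit_bins(values):
    """Return the 25th/50th/75th percentile cut points of one continuous feature."""
    return quantiles(values, n=4, method="inclusive")


def discretize(value, cuts):
    """Map a value to a bin index; values equal to a cut point fall in the lower bin."""
    return bisect_left(cuts, value)


# Hypothetical training column for a continuous feature.
ages = [22, 25, 31, 38, 40, 47, 52, 60, 63]
cuts = fit_bins(ages)                        # percentile boundaries learned from the data
bins = [discretize(a, cuts) for a in ages]   # each value replaced by its bin index
```

An anchor condition on such a feature is then expressed as a bin (a range between two cut points) rather than as a raw value.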

Each explainer is serialized via explainer.save(dirname) to produce artifacts that can be deployed on MLServer with the Alibi-Explain runtime.
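
A minimal sketch of the fit-then-save flow for AnchorTabular follows, assuming the alibi package is installed. The helper name `train_and_save_anchor_tabular` and its arguments are hypothetical, and the quartile values passed as `disc_perc` are only an example setting.

```python
def train_and_save_anchor_tabular(predict_fn, X_train, feature_names, category_map, out_dir):
    """Fit an AnchorTabular explainer on X_train and serialize it to out_dir.

    Hypothetical helper. alibi is imported inside the function so that
    defining it does not require the library to be installed.
    """
    from alibi.explainers import AnchorTabular

    # predict_fn is the black-box predictor; category_map maps categorical
    # column indices to their category names.
    explainer = AnchorTabular(predict_fn, feature_names, categorical_names=category_map)
    # disc_perc sets the percentile boundaries used to bin continuous features.
    explainer.fit(X_train, disc_perc=(25, 50, 75))
    # Writes the explainer artifacts into the out_dir directory.
    explainer.save(out_dir)
    return explainer
```

The resulting directory is what would be uploaded to the storage location used for deployment.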

Theoretical Basis

Anchor explanations find minimal sufficient conditions (IF-THEN rules) that "anchor" a prediction: if the anchor conditions hold, the prediction stays the same with high probability. Formally, an anchor A for an instance x is a rule such that:

P(f(x) = f(z) | z satisfies A) ≥ τ

where z is a perturbed sample and τ is a precision threshold. The algorithm uses a beam search to build candidate anchors iteratively, estimating the precision of each candidate via Monte Carlo sampling, and stops once a candidate's precision exceeds the threshold.
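
The Monte Carlo precision estimate can be sketched with the standard library alone. This is an illustration under stated assumptions, not Alibi's sampler: the toy classifier `predict`, the `background` rows, and the helper `estimate_precision` are all hypothetical, and perturbations are drawn by resampling whole background rows.

```python
import random


def predict(row):
    # Toy black-box classifier (hypothetical): positive if either feature is high.
    return int(row["age"] > 30 or row["income"] > 80)


def estimate_precision(instance, anchor, background, n_samples=1000, seed=0):
    """Monte Carlo estimate of P(f(x) = f(z) | z satisfies A)."""
    rng = random.Random(seed)
    target = predict(instance)
    hits = 0
    for _ in range(n_samples):
        # Draw a perturbation z from the background data, then pin the
        # anchor features to the values they take in the instance.
        z = dict(rng.choice(background))
        for name in anchor:
            z[name] = instance[name]
        hits += predict(z) == target
    return hits / n_samples


background = [
    {"age": 20, "income": 30},
    {"age": 25, "income": 90},
    {"age": 35, "income": 40},
    {"age": 45, "income": 20},
]
instance = {"age": 45, "income": 20}
precision_age = estimate_precision(instance, ["age"], background)
precision_income = estimate_precision(instance, ["income"], background)
```

In this toy setup, pinning age alone keeps the prediction fixed for every sample, so it would pass any threshold τ, whereas pinning income alone leaves the prediction dependent on the sampled age and gives a precision well below 1.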

KernelShap approximates Shapley values using a weighted linear regression on perturbed inputs. Shapley values decompose the model output into additive contributions from each feature, providing a theoretically grounded measure of feature importance.
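
The weights used in that regression come from the Shapley kernel, which depends only on how many features are present in a perturbed input. The sketch below computes the kernel for a coalition of size s out of M features; the function name `shapley_kernel_weight` is hypothetical and the regression itself is omitted.

```python
from math import comb


def shapley_kernel_weight(M, s):
    """Shapley kernel weight of a coalition with s of M features present (0 < s < M)."""
    if not 0 < s < M:
        # The kernel diverges for empty and full coalitions; practical
        # implementations handle these two cases as constraints instead.
        raise ValueError("weight is only finite for 0 < s < M")
    return (M - 1) / (comb(M, s) * s * (M - s))


# Weights for every coalition size of a 4-feature model.
weights = {s: shapley_kernel_weight(4, s) for s in range(1, 4)}
```

The weights are symmetric in s and M − s, and the smallest and largest coalitions receive the most weight because they isolate individual feature effects most directly.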

Mathematical Formulation

Anchor precision: P(f(x) = f(z) | A) ≥ τ (default τ = 0.95)
Anchor coverage: fraction of instances to which the anchor applies
Shapley value (approximated by KernelShap): φ_i = Σ_S [ |S|! (M − |S| − 1)! / M! ] · [ f(S ∪ {i}) − f(S) ]

where M is the total number of features, S is a subset of features not containing i, and f(S) is the expected model output when only features in S are present.
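
For a handful of features the Shapley formula can be evaluated exactly by enumerating every subset, which makes the weights concrete. The sketch below is a brute-force illustration, not KernelShap: the toy `model`, the instance `x`, and the `baseline` are hypothetical, and f(S) is simplified to the model output with absent features replaced by a single baseline value rather than a true expectation over background data.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a coalition value function value(S)."""
    M = n_features
    phi = [0.0] * M
    for i in range(M):
        others = [j for j in range(M) if j != i]
        for size in range(M):
            # |S|! (M - |S| - 1)! / M! for subsets S of this size.
            weight = factorial(size) * factorial(M - size - 1) / factorial(M)
            for subset in combinations(others, size):
                S = frozenset(subset)
                phi[i] += weight * (value(S | {i}) - value(S))
    return phi


x = (1.0, 2.0, 4.0)          # instance to explain (hypothetical)
baseline = (0.0, 0.0, 0.0)   # stand-in for "feature absent" (simplifying assumption)


def model(v):
    # Toy model with one interaction term between features 0 and 2.
    return 2 * v[0] + 3 * v[1] + v[0] * v[2]


def value(S):
    # Features in S keep their instance value; the rest fall back to the baseline.
    masked = [x[j] if j in S else baseline[j] for j in range(len(x))]
    return model(masked)


phi = shapley_values(value, len(x))
```

The interaction term is split equally between features 0 and 2, and the contributions sum to the difference between the model output on the instance and on the baseline. Enumeration costs grow exponentially with M, which is why KernelShap resorts to sampling and regression.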

Usage

Apply this principle when creating explanation artifacts for deployment alongside classifiers in Seldon Core 2. The trained explainer models are serialized and uploaded to a storage URI (e.g., a GCS bucket), then referenced by a Model CRD with an explainer section for deployment.

Related Pages

SeldonIO_Seldon_core_Alibi_Explainer_Training - implements this principle - Concrete tools for training model explainers provided by the alibi library.
SeldonIO_Seldon_core_Explainer_Model_Deployment - related principle - Deploying the trained explainer models in Seldon Core 2.
SeldonIO_Seldon_core_Explanation_Generation - related principle - Generating explanations from deployed explainer models.
