Principle:SeldonIO Seldon core Explainer Model Training
| Field | Value |
|---|---|
| Overview | Training model-agnostic explanation models that provide interpretable predictions for black-box classifiers. |
| Domains | Explainability, MLOps |
| Workflow | Model_Explainability |
| Related Implementation | SeldonIO_Seldon_core_Alibi_Explainer_Training |
| Last Updated | 2026-02-13 00:00 GMT |
Description
Alibi Explain provides several explainability algorithms, including AnchorTabular for tabular data (finds a minimal set of feature conditions under which the prediction holds with high probability), AnchorText for text data (finds a minimal set of words with the same property), and KernelShap for feature importance scores. Each explainer wraps a predictor function; AnchorTabular and KernelShap are additionally fitted on training or background data to learn the data distribution.
The training process involves:
- AnchorTabular: Learns discretization bins from continuous features and maps categorical features to their category names. The explainer is fitted on training data with configurable percentile boundaries for discretization.
- AnchorText: Wraps a text classifier's predict function and uses a spaCy NLP model for word-level perturbation. The sampling strategy determines how replacement words are chosen (e.g., unknown, which replaces perturbed words with an UNK token).
- KernelShap: Wraps a model's decision function and fits on background training data used to compute marginal expectations for missing features.
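To make the AnchorTabular discretization step above concrete, here is a minimal standard-library sketch of learning percentile bin edges from one continuous feature and mapping values to bins. The helper names (percentile, learn_bins, to_bin) and the sample ages are hypothetical, and the interpolation and boundary conventions are assumptions for illustration; this is not Alibi's internal discretizer.

```python
from bisect import bisect_left


def percentile(sorted_vals, q):
    """Linearly interpolated q-th percentile (0-100) of an already sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def learn_bins(column, disc_perc=(25, 50, 75)):
    """Learn bin edges for one continuous feature from training values."""
    vals = sorted(column)
    return [percentile(vals, q) for q in disc_perc]


def to_bin(value, edges):
    """Bin k holds values with edges[k-1] < value <= edges[k]."""
    return bisect_left(edges, value)


# Hypothetical training column (ages).
ages = [22, 25, 31, 38, 40, 47, 52, 58, 63]
edges = learn_bins(ages)
bins = [to_bin(v, edges) for v in (29, 40, 60)]
print(edges, bins)
```

With three percentile boundaries each continuous feature is reduced to four ordinal bins, which is what lets an anchor be expressed as a readable range condition on that feature.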
Each explainer is serialized via explainer.save(dirname) to produce artifacts that can be deployed on MLServer with the Alibi-Explain runtime.
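The fit-then-save flow for a tabular explainer might look like the sketch below. It assumes the alibi package is installed and that AnchorTabular accepts a predictor, feature names, and optional categorical names, with fit taking a disc_perc tuple; check these against the Alibi version in use. The wrapper name train_and_save_anchor_tabular and its arguments are hypothetical.

```python
def train_and_save_anchor_tabular(
    predict_fn,
    feature_names,
    train_data,
    out_dir,
    categorical_names=None,
    disc_perc=(25, 50, 75),
):
    """Fit an AnchorTabular explainer and serialize it to out_dir.

    predict_fn: callable mapping a batch of rows to predictions.
    categorical_names: optional dict {column index: list of category names}.
    """
    # Imported lazily so this module still loads where alibi is absent.
    from alibi.explainers import AnchorTabular

    explainer = AnchorTabular(
        predict_fn, feature_names, categorical_names=categorical_names
    )
    # Learns the discretization bins from the training data.
    explainer.fit(train_data, disc_perc=disc_perc)
    # Writes the artifacts that are later uploaded to the storage URI.
    explainer.save(out_dir)
    return explainer
```

A call such as train_and_save_anchor_tabular(clf.predict, names, X_train, "anchor-explainer") would leave a directory that can be uploaded as the explainer artifact; clf, names, and X_train are placeholders for a trained classifier and its data.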
Theoretical Basis
Anchor explanations find minimal sufficient conditions (IF-THEN rules) that "anchor" a prediction: as long as the anchor conditions hold, the prediction stays the same with high probability, whatever values the other features take. Formally, an anchor A for an instance x is a rule such that:
P(f(x) = f(z) | z satisfies A) ≥ τ
where z ranges over perturbed samples that satisfy A and τ is a precision threshold. The algorithm uses a beam search to iteratively build candidate anchors, estimating the precision of each via Monte Carlo sampling until a candidate's estimated precision meets the threshold.
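The Monte Carlo precision estimate can be sketched as follows. This is a toy illustration only: the classifier, the three binary features, and the uniform background set are hypothetical, and the sketch evaluates two fixed candidate anchors rather than running the beam search.

```python
import random
from itertools import product


def predict(row):
    """Toy black-box classifier over three binary features (hypothetical)."""
    high_income, senior, degree = row
    return 1 if high_income or (senior and degree) else 0


def estimate_precision(x, anchor, background, n_samples=2000, seed=0):
    """Monte Carlo estimate of P(f(z) = f(x) | z satisfies the anchor).

    anchor is a set of feature indices whose values are held at x's values;
    all other features are resampled from the background rows.
    """
    rng = random.Random(seed)
    target = predict(x)
    hits = 0
    for _ in range(n_samples):
        z = list(rng.choice(background))
        for i in anchor:
            z[i] = x[i]  # enforce the anchor conditions
        hits += predict(z) == target
    return hits / n_samples


background = list(product([0, 1], repeat=3))  # all 8 feature combinations
x = (1, 0, 1)  # instance to explain; predict(x) == 1

p_income = estimate_precision(x, anchor={0}, background=background)
p_degree = estimate_precision(x, anchor={2}, background=background)
print(p_income, p_degree)
```

Here fixing high_income alone always reproduces the prediction, so that candidate reaches precision 1.0, while fixing degree alone leaves the outcome dependent on the resampled features and falls short of a 0.95 threshold.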
KernelShap approximates Shapley values using a weighted linear regression on perturbed inputs. Shapley values decompose the model output into additive contributions from each feature, providing a theoretically grounded measure of feature importance.
Mathematical Formulation
- Anchor precision:
P(f(x) = f(z) | A) ≥ τ (default τ = 0.95)
- Anchor coverage: fraction of instances where the anchor applies
- KernelShap:
φ_i = Σ_{S ⊆ {1,…,M} \ {i}} [ |S|! (M − |S| − 1)! / M! ] · [ f(S ∪ {i}) − f(S) ]
where M is the total number of features, S is a subset of features not containing i, and f(S) is the expected model output when only features in S are present.
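For a handful of features the Shapley formula can be evaluated exactly by enumerating every subset, which is a useful sanity check on the definition. The sketch below does that for a hypothetical linear model and a two-row background set, taking f(S) as the average model output with features in S fixed to the instance's values and the rest drawn from the background. It is exact enumeration, not the weighted-regression approximation that KernelShap uses.

```python
from itertools import combinations
from math import factorial


def model(z):
    """Hypothetical model: a linear function of three features."""
    return 2.0 * z[0] + 3.0 * z[1] - 1.0 * z[2] + 1.0


def value(subset, x, background):
    """f(S): mean output with features in S taken from x, others from background."""
    total = 0.0
    for b in background:
        hybrid = [x[j] if j in subset else b[j] for j in range(len(x))]
        total += model(hybrid)
    return total / len(background)


def shapley_values(x, background):
    """Exact Shapley values by enumerating all subsets S not containing i."""
    m = len(x)
    phi = []
    for i in range(m):
        others = [j for j in range(m) if j != i]
        contrib = 0.0
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for combo in combinations(others, size):
                s = set(combo)
                gain = value(s | {i}, x, background) - value(s, x, background)
                contrib += weight * gain
        phi.append(contrib)
    return phi


background = [(0.0, 0.0, 0.0), (2.0, 2.0, 2.0)]  # feature means are (1, 1, 1)
x = (3.0, 1.0, 0.0)
phi = shapley_values(x, background)
base = value(set(), x, background)  # expected output with no features known
print(phi, base, model(x))
```

The contributions sum to model(x) minus the base value, which is the additivity property referred to above; the enumeration visits 2^(M-1) subsets per feature, which is why an approximation is needed for realistic M.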
Usage
Apply this principle when creating explanation artifacts for deployment alongside classifiers in Seldon Core 2. The trained explainer models are serialized and uploaded to a storage URI (e.g., a GCS bucket), then referenced by a Model CRD with an explainer section for deployment.
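As a rough illustration of the last step, the snippet below builds a Model resource with an explainer section as a Python dictionary and prints it as JSON, which kubectl can apply. The field names (storageUri, explainer.type, explainer.modelRef) and the apiVersion reflect the Seldon Core 2 Model resource as understood here and should be checked against the installed CRD; the resource name, bucket path, and referenced model name are hypothetical.

```python
import json


def explainer_manifest(name, storage_uri, explainer_type, model_ref):
    """Build a Model resource that points at a saved explainer artifact."""
    return {
        "apiVersion": "mlops.seldon.io/v1alpha1",
        "kind": "Model",
        "metadata": {"name": name},
        "spec": {
            # Location the serialized explainer directory was uploaded to.
            "storageUri": storage_uri,
            "explainer": {
                "type": explainer_type,  # e.g. anchor_tabular
                "modelRef": model_ref,   # name of the deployed classifier
            },
        },
    }


manifest = explainer_manifest(
    name="income-explainer",  # hypothetical
    storage_uri="gs://example-bucket/explainers/income-anchor",  # hypothetical
    explainer_type="anchor_tabular",
    model_ref="income",  # hypothetical classifier name
)
print(json.dumps(manifest, indent=2))
```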
Knowledge Sources
- Paper: Anchors: High-Precision Model-Agnostic Explanations
- Paper: A Unified Approach to Interpreting Model Predictions (SHAP)
- Doc: Alibi Explain Documentation
Related Pages
- SeldonIO_Seldon_core_Alibi_Explainer_Training - implements this principle - Concrete tools for training model explainers provided by the alibi library.
- SeldonIO_Seldon_core_Explainer_Model_Deployment - related principle - Deploying the trained explainer models in Seldon Core 2.
- SeldonIO_Seldon_core_Explanation_Generation - related principle - Generating explanations from deployed explainer models.
Implementation:SeldonIO_Seldon_core_Alibi_Explainer_Training