Implementation:SeldonIO Seldon core Seldon Model CRD HuggingFace
| Field | Value |
|---|---|
| Type | Pattern Doc |
| Overview | Concrete pattern for declaring HuggingFace models as Seldon Core 2 Model resources. |
| Source | samples/models/hf-sentiment.yaml:L1-8, samples/models/hf-text-gen.yaml:L1-8, samples/models/hf-whisper.yaml:L1-8 |
| Domains | MLOps, NLP, Kubernetes |
| Implements Principle | SeldonIO_Seldon_core_HuggingFace_Model_Resource_Definition |
| External Dependencies | Kubernetes API (mlops.seldon.io/v1alpha1), MLServer HuggingFace runtime |
| Knowledge Sources | Repo (https://github.com/SeldonIO/seldon-core), Doc (https://docs.seldon.io/projects/seldon-core/en/v2/) |
| Last Updated | 2026-02-13 00:00 GMT |
Code Reference
Sentiment Analysis Model
# hf-sentiment.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: sentiment
spec:
  storageUri: "gs://seldon-models/mlserver/huggingface/sentiment"
  requirements:
  - huggingface
Text Generation Model
# hf-text-gen.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: text-gen
spec:
  storageUri: "gs://seldon-models/mlserver/huggingface/text-gen"
  requirements:
  - huggingface
Speech-to-Text Model (Whisper)
# hf-whisper.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: whisper
spec:
  storageUri: "gs://seldon-models/mlserver/huggingface/whisper"
  requirements:
  - huggingface
Key Parameters
| Parameter | Example | Description |
|---|---|---|
| metadata.name | sentiment, text-gen, whisper | Unique model name used for routing inference requests |
| spec.storageUri | gs://seldon-models/mlserver/huggingface/sentiment | Remote URI pointing to the serialized HuggingFace pipeline artifacts |
| spec.requirements | ["huggingface"] | Capability selector; routes the model to a server with the HuggingFace MLServer runtime |
| spec.memory | "3Gi" (optional) | Memory allocation for larger models (e.g., Whisper); used by the scheduler for resource planning |
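The three manifests differ only in the parameters listed above, so they can be generated from one helper. The following is a minimal sketch, not part of the Seldon repository: the function name `hf_model_manifest` is hypothetical, and it emits JSON (which `kubectl apply -f` accepts alongside YAML) to stay within the Python standard library.

```python
import json
from typing import Optional


def hf_model_manifest(name: str, storage_uri: str, memory: Optional[str] = None) -> dict:
    """Build a Model manifest that targets the HuggingFace runtime.

    Hypothetical helper for illustration; field names follow the
    Key Parameters table.
    """
    spec = {
        "storageUri": storage_uri,
        # Capability selector that matches servers with the HuggingFace runtime.
        "requirements": ["huggingface"],
    }
    if memory is not None:
        # Optional field; only emitted when a value is supplied.
        spec["memory"] = memory
    return {
        "apiVersion": "mlops.seldon.io/v1alpha1",
        "kind": "Model",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    manifest = hf_model_manifest(
        "whisper",
        "gs://seldon-models/mlserver/huggingface/whisper",
        memory="3Gi",
    )
    # Print the manifest as JSON so it can be piped to `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

Leaving `memory` unset reproduces the shape of the sample files, which declare only `storageUri` and `requirements`.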
I/O Contract
Inputs
| Input | Format | Description |
|---|---|---|
| HuggingFace model artifacts | Remote URI (GCS, S3, etc.) | Serialized HuggingFace pipeline directory produced by save_pretrained() |
Outputs
| Output | Format | Description |
|---|---|---|
| Kubernetes Model resource | YAML manifest | Model CRD targeting the MLServer HuggingFace runtime via the huggingface requirement |
Usage Examples
Applying a single model
kubectl apply -f samples/models/hf-sentiment.yaml
Applying all HuggingFace models
kubectl apply -f samples/models/hf-sentiment.yaml
kubectl apply -f samples/models/hf-text-gen.yaml
kubectl apply -f samples/models/hf-whisper.yaml
Using the seldon CLI
seldon model load -f samples/models/hf-sentiment.yaml
seldon model load -f samples/models/hf-text-gen.yaml
seldon model load -f samples/models/hf-whisper.yaml
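The per-file commands above can also be scripted. The sketch below is an illustrative wrapper, not a Seldon utility: the names `load_commands` and `load_all` are hypothetical, and it assumes `kubectl` or the `seldon` CLI is on the PATH. It defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List

# Manifest paths taken from the examples above.
HF_MANIFESTS = [
    "samples/models/hf-sentiment.yaml",
    "samples/models/hf-text-gen.yaml",
    "samples/models/hf-whisper.yaml",
]


def load_commands(paths: List[str], tool: str = "kubectl") -> List[List[str]]:
    """Return one argv list per manifest for the chosen tool."""
    if tool == "kubectl":
        prefix = ["kubectl", "apply", "-f"]
    elif tool == "seldon":
        prefix = ["seldon", "model", "load", "-f"]
    else:
        raise ValueError(f"unsupported tool: {tool}")
    return [prefix + [path] for path in paths]


def load_all(paths: List[str], tool: str = "kubectl", dry_run: bool = True) -> None:
    """Print each command, or run it and stop at the first failure."""
    for cmd in load_commands(paths, tool):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    load_all(HF_MANIFESTS, tool="seldon")
```

Passing `dry_run=False` executes the commands in order.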
Specifying memory for large models
For models that require more memory (e.g., Whisper), add the memory field:
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: whisper
spec:
  storageUri: "gs://seldon-models/mlserver/huggingface/whisper"
  requirements:
  - huggingface
  memory: "3Gi"
Related Pages
- SeldonIO_Seldon_core_HuggingFace_Model_Resource_Definition -- principle that this implementation realizes
- SeldonIO_Seldon_core_Transformers_Pipeline_Save_Pretrained -- depends on serialized artifacts produced by the preparation step
- SeldonIO_Seldon_core_Seldon_Model_Load_HuggingFace -- consumed by the deployment step that loads these model definitions
- SeldonIO_Seldon_core_Seldon_Model_CRD -- specializes the general Model CRD pattern for HuggingFace models
- Environment:SeldonIO_Seldon_core_GPU_Inference_Environment