Implementation:SeldonIO Seldon core Seldon Model CRD HuggingFace

Field	Value
Type	Pattern Doc
Overview	Concrete pattern for declaring HuggingFace models as Seldon Core 2 Model resources.
Source	`samples/models/hf-sentiment.yaml:L1-8`, `samples/models/hf-text-gen.yaml:L1-8`, `samples/models/hf-whisper.yaml:L1-8`
Domains	MLOps, NLP, Kubernetes
Implements Principle	SeldonIO_Seldon_core_HuggingFace_Model_Resource_Definition
External Dependencies	Kubernetes API (mlops.seldon.io/v1alpha1), MLServer HuggingFace runtime
Knowledge Sources	Repo (https://github.com/SeldonIO/seldon-core), Doc (https://docs.seldon.io/projects/seldon-core/en/v2/)
Last Updated	2026-02-13 00:00 GMT

Code Reference

Sentiment Analysis Model

# hf-sentiment.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: sentiment
spec:
  storageUri: "gs://seldon-models/mlserver/huggingface/sentiment"
  requirements:
  - huggingface

Text Generation Model

# hf-text-gen.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: text-gen
spec:
  storageUri: "gs://seldon-models/mlserver/huggingface/text-gen"
  requirements:
  - huggingface

Speech-to-Text Model (Whisper)

# hf-whisper.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: whisper
spec:
  storageUri: "gs://seldon-models/mlserver/huggingface/whisper"
  requirements:
  - huggingface
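
The three manifests above differ only in `metadata.name` and the last path segment of `spec.storageUri`, so they can be produced from one template. The following is a minimal sketch under that assumption; `hf_model_manifest` and `as_kubectl_list` are hypothetical helper names, not part of Seldon Core. It emits JSON, which `kubectl apply -f` accepts alongside YAML, wrapped in a `v1` `List` so all three resources travel in one document.

```python
import json

# Bucket prefix and model names copied from the sample manifests above.
STORAGE_PREFIX = "gs://seldon-models/mlserver/huggingface"
MODEL_NAMES = ["sentiment", "text-gen", "whisper"]


def hf_model_manifest(name, memory=None):
    """Build a Model manifest as a plain dict (hypothetical helper)."""
    spec = {
        "storageUri": f"{STORAGE_PREFIX}/{name}",
        "requirements": ["huggingface"],
    }
    if memory is not None:
        # Optional spec.memory field, e.g. "3Gi".
        spec["memory"] = memory
    return {
        "apiVersion": "mlops.seldon.io/v1alpha1",
        "kind": "Model",
        "metadata": {"name": name},
        "spec": spec,
    }


def as_kubectl_list(manifests):
    """Wrap several resources in a v1 List so they fit in one JSON document."""
    return {"apiVersion": "v1", "kind": "List", "items": list(manifests)}


manifests = [hf_model_manifest(name) for name in MODEL_NAMES]

if __name__ == "__main__":
    print(json.dumps(as_kubectl_list(manifests), indent=2))
```

Saving the printed output to a file and passing it to `kubectl apply -f` should be equivalent to applying the three sample files one by one, assuming the same namespace.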

Key Parameters

Parameter	Example	Description
metadata.name	`sentiment`, `text-gen`, `whisper`	Unique model name used for routing inference requests
spec.storageUri	`gs://seldon-models/mlserver/huggingface/sentiment`	Remote URI pointing to the serialized HuggingFace pipeline artifacts
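spec.requirements	`["huggingface"]`	Capability selector; the scheduler places the model only on a server that offers the `huggingface` capability, i.e. one running the MLServer HuggingFace runtime
spec.memory	`"3Gi"` (optional)	Memory the model is expected to need once loaded, worth setting for larger models such as Whisper; the scheduler uses it for resource planning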
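
A quick pre-flight check of these fields can catch mistakes before a manifest reaches the cluster. The sketch below is illustrative only: `check_hf_model` is a hypothetical helper, the name pattern is the RFC 1123 subdomain rule that Kubernetes applies to most resource names, and the `storageUri` test is a loose heuristic that only looks for a URI scheme.

```python
import re

# RFC 1123 subdomain: lowercase alphanumerics, '-' and '.', max 253 characters.
_LABEL = r"[a-z0-9]([-a-z0-9]*[a-z0-9])?"
_NAME_RE = re.compile(rf"{_LABEL}(\.{_LABEL})*")


def check_hf_model(manifest):
    """Return a list of problems in a Model manifest dict; empty means OK."""
    problems = []
    name = manifest.get("metadata", {}).get("name", "")
    spec = manifest.get("spec", {})

    if len(name) > 253 or not _NAME_RE.fullmatch(name):
        problems.append(f"metadata.name {name!r} is not a valid Kubernetes name")
    # Heuristic only: a remote URI such as gs://... or s3://... has a scheme.
    if "://" not in spec.get("storageUri", ""):
        problems.append("spec.storageUri should be a remote URI")
    if "huggingface" not in spec.get("requirements", []):
        problems.append("spec.requirements should include 'huggingface'")
    return problems


if __name__ == "__main__":
    sample = {
        "metadata": {"name": "sentiment"},
        "spec": {
            "storageUri": "gs://seldon-models/mlserver/huggingface/sentiment",
            "requirements": ["huggingface"],
        },
    }
    print(check_hf_model(sample) or "manifest looks consistent")
```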

I/O Contract

Inputs

Input	Format	Description
HuggingFace model artifacts	Remote URI (GCS, S3, etc.)	Serialized HuggingFace pipeline directory produced by `save_pretrained()`

Outputs

Output	Format	Description
Kubernetes Model resource	YAML manifest	Model CRD targeting the MLServer HuggingFace runtime via the `huggingface` requirement

Usage Examples

Applying a single model

kubectl apply -f samples/models/hf-sentiment.yaml

Applying all HuggingFace models

kubectl apply -f samples/models/hf-sentiment.yaml
kubectl apply -f samples/models/hf-text-gen.yaml
kubectl apply -f samples/models/hf-whisper.yaml

Using the seldon CLI

seldon model load -f samples/models/hf-sentiment.yaml
seldon model load -f samples/models/hf-text-gen.yaml
seldon model load -f samples/models/hf-whisper.yaml
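
When the same set of files has to be loaded repeatedly, the commands above can be assembled in a loop. This is a sketch, not project tooling: `load_commands` and `run_all` are hypothetical names, the script assumes it runs from the repository root, and it defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess

# Sample manifests referenced on this page, relative to the repository root.
SAMPLES = [
    "samples/models/hf-sentiment.yaml",
    "samples/models/hf-text-gen.yaml",
    "samples/models/hf-whisper.yaml",
]


def load_commands(tool="kubectl"):
    """Return one argv list per sample file for the chosen tool."""
    if tool == "kubectl":
        prefix = ["kubectl", "apply", "-f"]
    elif tool == "seldon":
        prefix = ["seldon", "model", "load", "-f"]
    else:
        raise ValueError(f"unsupported tool: {tool}")
    return [prefix + [path] for path in SAMPLES]


def run_all(tool="kubectl", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in load_commands(tool):
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all("kubectl", dry_run=True)
```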

Specifying memory for large models

For models that require more memory (e.g., Whisper), add the `memory` field under `spec`:

apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: whisper
spec:
  storageUri: "gs://seldon-models/mlserver/huggingface/whisper"
  requirements:
  - huggingface
  memory: "3Gi"
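
The `memory` value uses Kubernetes quantity notation, so `"3Gi"` means 3 × 1024³ bytes. To compare such values numerically, for instance against a server's available memory, a small converter helps. The sketch below is a simplified assumption: `quantity_to_bytes` is a hypothetical helper that handles only the common binary and decimal suffixes, not the full Kubernetes quantity grammar.

```python
# Common suffixes from the Kubernetes resource quantity notation.
_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def quantity_to_bytes(quantity):
    """Convert a quantity string such as "3Gi" to an integer byte count."""
    # Try two-letter suffixes first so "Mi" is not mistaken for "M".
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            number = quantity[: -len(suffix)]
            return int(float(number) * _SUFFIXES[suffix])
    # No suffix: the value is already a plain byte count.
    return int(quantity)


if __name__ == "__main__":
    print(quantity_to_bytes("3Gi"))
```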

Related Pages

SeldonIO_Seldon_core_HuggingFace_Model_Resource_Definition -- principle that this implementation realizes
SeldonIO_Seldon_core_Transformers_Pipeline_Save_Pretrained -- depends on serialized artifacts produced by the preparation step
SeldonIO_Seldon_core_Seldon_Model_Load_HuggingFace -- consumed by the deployment step that loads these model definitions
SeldonIO_Seldon_core_Seldon_Model_CRD -- specializes the general Model CRD pattern for HuggingFace models
Environment:SeldonIO_Seldon_core_GPU_Inference_Environment
