Principle:SeldonIO Seldon core Model Resource Definition

From Leeroopedia
Revision as of 17:42, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/SeldonIO_Seldon_core_Model_Resource_Definition.md)
Principle Name: Model_Resource_Definition
Overview: Declarative specification of ML model resources using Kubernetes Custom Resource Definitions.
Workflow: Model_Deployment
Domains: MLOps, Kubernetes
Related Implementation: SeldonIO_Seldon_core_Seldon_Model_CRD
Last Updated: 2026-02-13 00:00 GMT

Description

Seldon Core 2 uses a Model CRD (apiVersion: mlops.seldon.io/v1alpha1, kind: Model) to declare model artifacts with storage URIs, runtime requirements, and memory allocations. The scheduler then assigns models to matching inference servers. This declarative approach means that operators specify what model they want deployed rather than how to deploy it, and the Seldon Core 2 control plane handles the orchestration.

The Model CRD captures several critical pieces of information:

  • metadata.name: A unique identifier for the model within the namespace
  • spec.storageUri: The location of the model artifact (GCS, S3, MinIO, or local paths)
  • spec.requirements: A list of runtime capability tags (e.g., sklearn, tensorflow, huggingface) that must match a Server's capabilities
  • spec.memory: Optional memory allocation hint for the scheduler (e.g., "100Ki")

The scheduler uses the requirements list to find a compatible inference Server. For example, a model with requirements: ["sklearn"] will be assigned to a Server that has the sklearn capability, typically an MLServer instance with the scikit-learn runtime installed.
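The matching rule described above can be read as a subset check: a Server is compatible when its capabilities cover every entry in the model's requirements. The following is a minimal sketch of that idea, not the scheduler's actual code. The `Server` class, the `compatible_servers` helper, and the capability sets are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stand-in for an inference Server and its capability tags."""
    name: str
    capabilities: set[str] = field(default_factory=set)


def compatible_servers(requirements: list[str], servers: list[Server]) -> list[str]:
    """Return the names of servers whose capabilities cover every requirement."""
    needed = set(requirements)
    # A server qualifies only if the required tags are a subset of its capabilities.
    return [s.name for s in servers if needed <= s.capabilities]


# Illustrative capability sets (assumed, not taken from a real cluster).
servers = [
    Server("mlserver", {"sklearn", "xgboost", "huggingface"}),
    Server("triton", {"tensorflow", "onnx"}),
]
matches = compatible_servers(["sklearn"], servers)
print(matches)
```

Under these assumptions, a model that lists a tag no Server offers gets an empty result, which corresponds to a model that cannot be scheduled until a suitable Server exists.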

Theoretical Basis

Kubernetes Custom Resource Definitions (CRDs) extend the Kubernetes API with domain-specific resources. The Model CRD declaratively captures what model to load, from where, and with what runtime constraints. This follows the Kubernetes operator pattern where:

  1. Desired state is expressed as a CRD manifest (the Model resource)
  2. Actual state is tracked by the controller (model loaded on a specific Server)
  3. Reconciliation continuously drives actual state toward desired state
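The three steps above can be sketched as a small loop that compares desired and actual state and applies the difference. This is a simplified illustration of the reconciliation idea only. The `plan` and `reconcile_once` functions, the dictionary representation of state, and the `old-model` entry are assumptions made for the example and do not reflect the Seldon controller's implementation.

```python
def plan(desired: dict[str, str], actual: dict[str, str]) -> list[tuple[str, str]]:
    """Return the (action, model name) steps needed to reach the desired state.

    Both dicts map a model name to its storage URI.
    """
    actions = []
    for name, uri in desired.items():
        if name not in actual:
            actions.append(("load", name))
        elif actual[name] != uri:
            actions.append(("reload", name))
    for name in actual:
        if name not in desired:
            actions.append(("unload", name))
    return actions


def reconcile_once(desired: dict[str, str], actual: dict[str, str]) -> list[tuple[str, str]]:
    """Apply one reconciliation pass, mutating `actual` toward `desired`."""
    actions = plan(desired, actual)
    for action, name in actions:
        if action == "unload":
            del actual[name]
        else:  # "load" and "reload" both end with the desired URI in place
            actual[name] = desired[name]
    return actions


desired = {"iris": "gs://seldon-models/mlserver/iris"}
actual = {"old-model": "gs://example-bucket/old-model"}  # hypothetical leftover model

first_pass = reconcile_once(desired, actual)
second_pass = reconcile_once(desired, actual)
print(first_pass, second_pass)
```

Once actual state matches desired state, a further pass produces no actions, which is the property that makes the loop safe to run repeatedly.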

The Model CRD abstracts away infrastructure concerns from ML engineers. Instead of manually configuring inference servers, mounting volumes, and managing processes, users declare their intent through a simple YAML manifest. Seldon Core 2 then handles:

  • Server selection: Matching model requirements to Server capabilities
  • Artifact retrieval: Downloading model files from remote storage via rclone
  • Runtime loading: Invoking the appropriate MLServer or Triton runtime to load the model
  • Capacity planning: Respecting memory constraints and server overcommit ratios
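To make the capacity-planning item concrete, the sketch below checks whether a model's memory hint fits on a server once an overcommit allowance is applied. It is an assumption-laden illustration: the `parse_memory` and `fits` helpers are hypothetical, only the binary suffixes Ki, Mi, and Gi are handled, and treating overcommit as a percentage added to server memory is a simplification rather than a statement of how Seldon Core 2 computes it.

```python
# Binary suffixes as used by Kubernetes resource quantities (1Ki = 1024 bytes).
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as "100Ki" to bytes (binary suffixes only)."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: treat as a plain byte count


def fits(model_memory: str, server_memory: str, used_bytes: int,
         overcommit_pct: int = 0) -> bool:
    """Return True if the model fits within server memory plus overcommit.

    overcommit_pct is a hypothetical knob: 10 means 10% beyond real memory.
    """
    capacity = parse_memory(server_memory) * (100 + overcommit_pct) // 100
    return used_bytes + parse_memory(model_memory) <= capacity


model_bytes = parse_memory("100Ki")
without_overcommit = fits("100Ki", "1Mi", used_bytes=1000 * 1024)
with_overcommit = fits("100Ki", "1Mi", used_bytes=1000 * 1024, overcommit_pct=10)
print(model_bytes, without_overcommit, with_overcommit)
```

In this example the server already holds 1000 KiB of models, so a further 100 KiB model only fits when the overcommit allowance is enabled.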

This separation of concerns enables platform teams to manage infrastructure (Servers, storage, networking) independently from ML teams who focus on model definitions.

Usage

This principle applies when defining any model for deployment on Seldon Core 2, regardless of framework (sklearn, TensorFlow, HuggingFace, etc.). The typical workflow is:

  1. Prepare the model artifact and upload it to a storage backend
  2. Write a Model CRD YAML specifying the storageUri and requirements
  3. Apply the manifest to the Kubernetes cluster

The following manifest declares a scikit-learn model named iris:

apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris
spec:
  storageUri: "gs://seldon-models/mlserver/iris"
  requirements:
  - sklearn
  memory: 100Ki

Models can also specify additional fields such as spec.server to pin to a specific Server, or spec.explainer to attach model explanations.
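When manifests are produced by a pipeline rather than written by hand, the same structure can be assembled programmatically. The sketch below builds the manifest as a dictionary and serialises it to JSON. The `model_manifest` helper is hypothetical, and the value passed as `server` is assumed to be the name of an existing Server resource.

```python
import json


def model_manifest(name, storage_uri, requirements, memory=None, server=None):
    """Build a Model manifest as a plain dict (hypothetical helper)."""
    spec = {"storageUri": storage_uri, "requirements": list(requirements)}
    if memory is not None:
        spec["memory"] = memory
    if server is not None:
        # Assumption: spec.server holds the name of the Server to pin to.
        spec["server"] = server
    return {
        "apiVersion": "mlops.seldon.io/v1alpha1",
        "kind": "Model",
        "metadata": {"name": name},
        "spec": spec,
    }


manifest = model_manifest(
    "iris",
    "gs://seldon-models/mlserver/iris",
    ["sklearn"],
    memory="100Ki",
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output can be saved to a file and applied with `kubectl apply -f`, which avoids a YAML library dependency in the generating script.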

Related Pages

Implementation:SeldonIO_Seldon_core_Seldon_Model_CRD
