Workflow:SeldonIO Seldon core Inference Pipeline

Knowledge Sources	Seldon Core 2, Seldon Core 2 Docs, Pipeline Documentation
Domains	MLOps, Data_Engineering, Kubernetes
Last Updated	2026-02-13 14:00 GMT

Overview

End-to-end process for composing multiple models into a directed acyclic graph (DAG) pipeline for multi-step inference in Seldon Core 2.

Description

This workflow covers creating inference pipelines that chain multiple models together using Seldon Core 2's Pipeline custom resource. Pipelines connect models via Kafka-based data flow, enabling patterns such as linear chains (model A feeds model B), parallel fan-out with joins, conditional routing based on model outputs, and pipeline-to-pipeline composition. Each step in the pipeline can optionally define tensor mappings to transform output names into the format expected by downstream steps. Pipelines support batch processing, trigger-based execution, and selective output exposure.

Usage

Execute this workflow when you need to compose multiple inference components into a single application endpoint. Common scenarios include preprocessing followed by prediction, ensemble models that aggregate multiple classifiers, conditional routing based on data content, and multi-modal pipelines that combine different types of models (e.g., speech-to-text followed by sentiment analysis).

Execution Steps

Step 1: Deploy Component Models

Deploy all individual models that will participate in the pipeline. Each model must be independently loaded and reach the Available state before the pipeline can become ready. Models can include classifiers, preprocessors, transformers, detectors, or any other inference component.

Key considerations:

All models referenced by the pipeline must be deployed before or concurrently with the pipeline
Models can be deployed on different inference servers (MLServer and Triton can coexist in the same pipeline)
Each model must have the correct requirements and memory allocation configured
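
As a rough illustration of this step, the sketch below builds Model resources as plain Python dicts and prints them as a JSON list that could be passed to kubectl apply -f. The model names, storage URIs, memory values, and the mlserver/triton requirement strings are placeholders chosen for the example; the field names (storageUri, requirements, memory) follow the Model resource as commonly shown in the Seldon Core 2 docs and should be checked against the installed CRD version.

```python
import json

API_VERSION = "mlops.seldon.io/v1alpha1"


def model_manifest(name, storage_uri, requirements, memory=None):
    """Return a Model resource as a plain dict (serializable to JSON)."""
    spec = {"storageUri": storage_uri, "requirements": list(requirements)}
    if memory is not None:
        spec["memory"] = memory
    return {
        "apiVersion": API_VERSION,
        "kind": "Model",
        "metadata": {"name": name},
        "spec": spec,
    }


# Hypothetical component models: names, URIs, and sizes are placeholders.
models = [
    model_manifest("preprocessor", "gs://example-bucket/preprocessor", ["mlserver"], "100Mi"),
    model_manifest("classifier", "gs://example-bucket/classifier", ["triton"], "1Gi"),
]

# A v1 List groups the models so they can be applied in a single call.
manifest_list = {"apiVersion": "v1", "kind": "List", "items": models}
print(json.dumps(manifest_list, indent=2))
```

Keeping each model in its own resource matches the requirement above that every model loads independently before the pipeline can become ready.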

Step 2: Define Pipeline Topology

Design the DAG structure of the pipeline by defining the step dependencies and data flow. Each step references a model name and declares its inputs (which can come from previous steps or from the pipeline-level input). The output section defines which step results are exposed in the pipeline response.

Key considerations:

A step references upstream steps by name in its inputs list
Use tensorMap to rename output tensors when downstream models expect different input names
A step with no inputs specified receives the pipeline input by default
Output steps can select specific tensors (e.g., step_name.outputs.TENSOR_NAME)
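
A minimal sketch of a two-step linear topology, written as a Python dict that mirrors the Pipeline resource. The pipeline name, step names, and tensor names (tokens, input_ids, label) are hypothetical. The helper at the end is not part of Seldon; it only lists which steps each step reads from, assuming step names contain no dots.

```python
import json

# Hypothetical pipeline: "preprocessor" feeds "classifier".
pipeline = {
    "apiVersion": "mlops.seldon.io/v1alpha1",
    "kind": "Pipeline",
    "metadata": {"name": "text-pipeline"},
    "spec": {
        "steps": [
            # No inputs listed, so this step takes the pipeline input.
            {"name": "preprocessor"},
            {
                "name": "classifier",
                "inputs": ["preprocessor"],
                # Rename the upstream output tensor to the name this model expects.
                "tensorMap": {"preprocessor.outputs.tokens": "input_ids"},
            },
        ],
        # Expose a single tensor from the final step.
        "output": {"steps": ["classifier.outputs.label"]},
    },
}


def upstream_steps(step):
    """Names of the steps a step reads from (reference prefix before the first dot)."""
    return sorted({ref.split(".")[0] for ref in step.get("inputs", [])})


for step in pipeline["spec"]["steps"]:
    print(step["name"], "<-", upstream_steps(step) or ["<pipeline input>"])

print(json.dumps(pipeline, indent=2))
```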

Step 3: Configure Advanced Routing

Optionally add conditional logic, triggers, and joins to the pipeline. Triggers gate step execution based on upstream results, joins aggregate outputs from multiple parallel steps, and conditional models route data to different branches based on content.

Key considerations:

Triggers use a step's output as a gate condition (step only runs if trigger fires)
Joins take inner (all sources required), outer, or any semantics, set with inputsJoinType for a step's inputs and triggersJoinType for its triggers
Conditional routing requires a model that produces multiple named outputs for different branches
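
The two patterns above can be sketched as pipeline specs built in Python. All step and tensor names are hypothetical, and the field names (triggers, triggersJoinType, stepsJoin) reflect the Pipeline resource as understood from the Seldon Core 2 docs, so verify them against the installed version. The first spec routes through a conditional model with two named outputs; the second gates a step on a trigger.

```python
import json


def step(name, inputs=None, tensor_map=None, triggers=None, triggers_join=None):
    """Build one pipeline step, omitting fields that are not set."""
    out = {"name": name}
    if inputs:
        out["inputs"] = list(inputs)
    if tensor_map:
        out["tensorMap"] = dict(tensor_map)
    if triggers:
        out["triggers"] = list(triggers)
    if triggers_join:
        out["triggersJoinType"] = triggers_join
    return out


# Conditional routing: "router" emits OUTPUT0 or OUTPUT1, one per branch.
conditional_spec = {
    "steps": [
        step("router"),
        step("branch-a", ["router.outputs.OUTPUT0"], {"router.outputs.OUTPUT0": "INPUT"}),
        step("branch-b", ["router.outputs.OUTPUT1"], {"router.outputs.OUTPUT1": "INPUT"}),
    ],
    # Return whichever branch produced a result.
    "output": {"steps": ["branch-a", "branch-b"], "stepsJoin": "any"},
}

# Trigger gating: "responder" consumes "preprocessor" output, but only
# runs when "detector" emits its (hypothetical) "flag" tensor.
triggered_spec = {
    "steps": [
        step("preprocessor"),
        step("detector"),
        step(
            "responder",
            inputs=["preprocessor"],
            triggers=["detector.outputs.flag"],
            triggers_join="any",
        ),
    ],
    "output": {"steps": ["responder"]},
}

print(json.dumps({"conditional": conditional_spec, "triggered": triggered_spec}, indent=2))
```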
Pipeline-to-pipeline composition allows one pipeline to reference another pipeline as a step

Step 4: Deploy Pipeline

Apply the Pipeline custom resource to the cluster. The Seldon scheduler validates the pipeline topology, creates the necessary Kafka topics for inter-step communication, and configures the data flow engine (chainer) to route data between steps.

Key considerations:

Pipeline names must not collide with any model name in the same namespace
Kafka topics are automatically created for each pipeline step
The pipeline status moves from PipelineCreating to PipelineReady once deployment succeeds
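
Before applying the resource, some of these checks can be approximated locally. The sketch below is a client-side pre-flight helper, not the scheduler's own validation: it flags a pipeline name that collides with a known model name, references to undefined steps, and cycles in the step graph. It assumes step names contain no dots and that the set of deployed model names is supplied by the caller; the example pipeline is hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def upstream_steps(step):
    """Steps referenced by a step's inputs and triggers (prefix before the first dot)."""
    refs = step.get("inputs", []) + step.get("triggers", [])
    return {ref.split(".")[0] for ref in refs}


def preflight(pipeline, existing_model_names):
    """Return a list of problems found; an empty list means no issue was detected."""
    problems = []
    name = pipeline["metadata"]["name"]
    if name in existing_model_names:
        problems.append(f"pipeline name {name!r} collides with a model name")

    graph = {s["name"]: upstream_steps(s) for s in pipeline["spec"]["steps"]}
    for step_name, deps in graph.items():
        for dep in sorted(deps - set(graph)):
            problems.append(f"step {step_name!r} references undefined step {dep!r}")

    try:
        # static_order() raises CycleError when the graph is not a DAG.
        list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        problems.append(f"cycle detected: {exc.args[1]}")
    return problems


# Hypothetical pipeline used only to exercise the checks.
pipeline = {
    "metadata": {"name": "text-pipeline"},
    "spec": {
        "steps": [
            {"name": "preprocessor"},
            {"name": "classifier", "inputs": ["preprocessor"]},
        ],
        "output": {"steps": ["classifier"]},
    },
}

print(preflight(pipeline, {"preprocessor", "classifier"}))
```

Running the same checks in CI catches naming and topology mistakes before the scheduler rejects the resource.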

Step 5: Verify Pipeline Readiness

Wait for the pipeline to reach the PipelineReady condition. Check that all component models are loaded and the Kafka topic infrastructure is provisioned. Query pipeline metadata to verify the expected input/output schema.

Key considerations:

Pipeline readiness depends on all referenced models being available
Use the pipeline metadata endpoint to verify the expected tensor names and shapes
The seldon pipeline inspect command shows the data flowing through individual steps, which helps when debugging
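
A readiness check can be sketched as a small function over the resource's JSON, for example the output of kubectl get pipeline NAME -o json. The sample resource and metadata below are hand-written for illustration, not captured from a cluster. The sketch assumes the status uses the standard Kubernetes conditions list with a PipelineReady entry, and that the metadata response lists inputs and outputs by name as in the Open Inference Protocol.

```python
def condition_status(resource, condition_type):
    """Return the status string of a named condition, or None if it is absent."""
    for cond in resource.get("status", {}).get("conditions", []):
        if cond.get("type") == condition_type:
            return cond.get("status")
    return None


def pipeline_is_ready(resource):
    """True only when the PipelineReady condition is reported as "True"."""
    return condition_status(resource, "PipelineReady") == "True"


def missing_tensors(metadata, expected_inputs, expected_outputs):
    """Compare expected tensor names with those in a metadata response."""
    have_in = {t["name"] for t in metadata.get("inputs", [])}
    have_out = {t["name"] for t in metadata.get("outputs", [])}
    return {
        "inputs": sorted(set(expected_inputs) - have_in),
        "outputs": sorted(set(expected_outputs) - have_out),
    }


# Hand-written sample data (hypothetical names and shapes).
sample_resource = {
    "metadata": {"name": "text-pipeline"},
    "status": {"conditions": [{"type": "PipelineReady", "status": "True"}]},
}
sample_metadata = {
    "inputs": [{"name": "text", "datatype": "BYTES", "shape": [-1]}],
    "outputs": [{"name": "label", "datatype": "BYTES", "shape": [-1]}],
}

print(pipeline_is_ready(sample_resource))
print(missing_tensors(sample_metadata, ["text"], ["label", "score"]))
```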

Step 6: Run Pipeline Inference

Send inference requests to the pipeline endpoint. The request is routed through the pipeline steps according to the defined topology, with intermediate results flowing through Kafka topics. The final response contains the outputs defined in the pipeline's output section.

Key considerations:

Pipeline inference endpoint: /v2/pipelines/{pipeline_name}/infer
Input format must match the first step's expected input schema
Pipeline response latency includes all step execution times plus Kafka overhead
Both REST and gRPC are supported for pipeline inference
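
A REST request to the endpoint above might be assembled as follows. This is a sketch that builds the request without sending it: the base URL, pipeline name, and input tensor are hypothetical, and the payload shape follows the Open Inference Protocol (V2) inputs format of name, shape, datatype, and data.

```python
import json
import urllib.request


def build_infer_request(base_url, pipeline_name, inputs):
    """Build (but do not send) a POST request for pipeline inference."""
    url = f"{base_url.rstrip('/')}/v2/pipelines/{pipeline_name}/infer"
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and input tensor.
request = build_infer_request(
    "http://localhost:9000",
    "text-pipeline",
    [{"name": "text", "shape": [1], "datatype": "BYTES", "data": ["hello world"]}],
)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To send it against a live cluster (not executed here):
# with urllib.request.urlopen(request, timeout=10) as response:
#     result = json.load(response)
```

Depending on how the cluster ingress is configured, additional routing headers may be required; check the deployment's inference documentation before relying on this path alone.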
