Principle:Kserve Kserve InferenceGraph Specification
| Knowledge Sources | |
|---|---|
| Domains | Pipeline, Kubernetes, Model_Serving |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
A declarative specification for defining multi-model inference pipelines as a directed graph of routing nodes and inference steps.
Description
The InferenceGraph Specification defines the CRD that allows users to compose multiple InferenceServices into a single inference pipeline. The spec contains:
- Nodes: A map of named routing nodes, each with a `routerType` and a list of steps. A `root` node is mandatory.
- Steps: Each step targets either a `serviceName` (InferenceService), `nodeName` (another graph node), or `serviceURL` (external endpoint).
- Data forwarding: Steps can forward `$request` (original) or `$response` (previous step output).
- Conditions: Switch nodes use GJSON expressions for conditional routing.
- Weights: Splitter nodes use integer weights that must sum to 100.
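The fields listed above can be seen together in one small manifest. The following is a minimal sketch, not a reference manifest: it builds an InferenceGraph as a Python dictionary and serializes it to JSON. The graph name `model-pipeline`, the InferenceService names, the node names `by-label` and `cat-split`, and the GJSON conditions are hypothetical placeholders, and the `apiVersion` is assumed to be `serving.kserve.io/v1alpha1`.

```python
import json

# All resource names below (model-pipeline, classifier, dog-breed, cat-v1, ...)
# are hypothetical and chosen only for illustration.
graph = {
    "apiVersion": "serving.kserve.io/v1alpha1",  # assumed API group/version
    "kind": "InferenceGraph",
    "metadata": {"name": "model-pipeline"},
    "spec": {
        "nodes": {
            # Mandatory entry node: run the classifier, then hand its output on.
            "root": {
                "routerType": "Sequence",
                "steps": [
                    {"name": "classify", "serviceName": "classifier",
                     "data": "$request"},
                    {"name": "route", "nodeName": "by-label",
                     "data": "$response"},
                ],
            },
            # Switch node: each step carries a GJSON condition (illustrative).
            "by-label": {
                "routerType": "Switch",
                "steps": [
                    {"name": "dog", "serviceName": "dog-breed",
                     "condition": '[@this].#(predictions.0=="dog")'},
                    {"name": "cat", "nodeName": "cat-split",
                     "condition": '[@this].#(predictions.0=="cat")'},
                ],
            },
            # Splitter node: integer weights that sum to 100.
            "cat-split": {
                "routerType": "Splitter",
                "steps": [
                    {"name": "v1", "serviceName": "cat-v1", "weight": 80},
                    {"name": "v2", "serviceName": "cat-v2", "weight": 20},
                ],
            },
        }
    },
}

manifest = json.dumps(graph, indent=2)
print(manifest)
```

Because Kubernetes accepts JSON manifests as well as YAML, the printed document could in principle be saved to a file and applied with `kubectl apply -f`, assuming the referenced InferenceServices already exist in the namespace.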
Usage
Use an InferenceGraph when you need to compose multiple models into a single pipeline. It is preferable to manual service chaining because it provides:
- Declarative composition
- Built-in routing logic
- Automatic Knative/Deployment management for the router pod
- Single entry point for the entire pipeline
Theoretical Basis
    # Graph specification model (NOT implementation code)
    InferenceGraph:
      nodes:
        root:                      # MANDATORY entry node
          routerType: <type>
          steps:
            - name: step1
              serviceName: isvc-1  # Route to InferenceService
              data: "$request"     # Forward original request
            - name: step2
              nodeName: subgraph   # Route to another node
              data: "$response"    # Forward previous step output

    Validation rules:
      - "root" node must exist
      - Step names must be unique within a node
      - Each step must have exactly one target (serviceName XOR nodeName XOR serviceURL)
      - Splitter weights must sum to 100
      - No cycles (DAG constraint)
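The validation rules above can be checked mechanically. The sketch below is an illustrative, standalone checker over a plain dictionary shaped like `spec.nodes`; it is not KServe's admission webhook code. The function name `validate_graph` and the error messages are made up for this example, and steps without a `name` are simply skipped in the uniqueness check, which is an assumption of the sketch.

```python
TARGET_KEYS = ("serviceName", "nodeName", "serviceURL")


def validate_graph(nodes):
    """Return a list of rule violations; an empty list means the graph passed."""
    errors = []

    # Rule 1: the "root" node must exist.
    if "root" not in nodes:
        errors.append('missing mandatory "root" node')

    for node_name, node in nodes.items():
        steps = node.get("steps", [])
        seen = set()
        for step in steps:
            # Rule 2: step names must be unique within a node.
            name = step.get("name")
            if name is not None:
                if name in seen:
                    errors.append(f"{node_name}: duplicate step name {name!r}")
                seen.add(name)
            # Rule 3: exactly one target per step.
            targets = [key for key in TARGET_KEYS if step.get(key)]
            if len(targets) != 1:
                errors.append(
                    f"{node_name}: step {name!r} has {len(targets)} targets")
        # Rule 4: Splitter weights must sum to 100.
        if node.get("routerType") == "Splitter":
            total = sum(step.get("weight", 0) for step in steps)
            if total != 100:
                errors.append(f"{node_name}: weights sum to {total}, not 100")

    # Rule 5: no cycles. Depth-first search over nodeName edges; reaching a
    # node that is still on the current path (GRAY) means there is a cycle.
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {name: WHITE for name in nodes}

    def has_cycle(current):
        color[current] = GRAY
        for step in nodes[current].get("steps", []):
            nxt = step.get("nodeName")
            if nxt not in color:
                continue  # not a node reference, or an unknown node
            if color[nxt] == GRAY:
                return True
            if color[nxt] == WHITE and has_cycle(nxt):
                return True
        color[current] = BLACK
        return False

    for name in nodes:
        if color[name] == WHITE and has_cycle(name):
            errors.append("cycle detected in node references")
            break

    return errors


# The example graph from the specification model, with "subgraph" filled in
# as a hypothetical Splitter node.
valid = {
    "root": {
        "routerType": "Sequence",
        "steps": [
            {"name": "step1", "serviceName": "isvc-1", "data": "$request"},
            {"name": "step2", "nodeName": "subgraph", "data": "$response"},
        ],
    },
    "subgraph": {
        "routerType": "Splitter",
        "steps": [
            {"name": "a", "serviceName": "isvc-a", "weight": 60},
            {"name": "b", "serviceName": "isvc-b", "weight": 40},
        ],
    },
}
print(validate_graph(valid))  # prints []
```

Returning a list of violations rather than raising on the first one lets a caller report every problem in a single pass, which tends to be more convenient when checking a manifest before it is applied.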