Principle:Kserve Kserve InferenceGraph Specification

Knowledge Sources	KServe InferenceGraph Kubernetes CRD
Domains	Pipeline, Kubernetes, Model_Serving
Last Updated	2026-02-13 00:00 GMT

Overview

A declarative specification for defining multi-model inference pipelines as a directed graph of routing nodes and inference steps.

Description

The InferenceGraph specification is a Kubernetes custom resource definition (CRD) that lets users compose multiple InferenceServices into a single inference pipeline. The spec contains:

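Nodes: A map of named routing nodes, each with a routerType (Sequence, Switch, Ensemble, or Splitter) and a list of steps. A node named root is mandatory and serves as the entry point.
Steps: Each step targets exactly one of serviceName (an InferenceService), nodeName (another node in the graph), or serviceUrl (an external endpoint).
Data forwarding: A step can receive $request (the original graph input) or $response (the output of the previous step).
Conditions: Steps in Switch nodes carry a condition, written as a GJSON expression, that selects which step handles the request.
Weights: Steps in Splitter nodes carry integer weights that must sum to 100; traffic is split in proportion to them.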
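
The data-forwarding and weight semantics above can be sketched in a few lines of Python. This is an illustrative model only, not the KServe router: the names run_sequence, pick_split_step, and the services dictionary of stub callables are hypothetical, and the choice to treat a step without a data field as receiving $request is an assumption made for the sketch.

```python
import random
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for an InferenceService: a callable from payload to payload.
Service = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_sequence(steps: List[Dict[str, Any]],
                 services: Dict[str, Service],
                 request: Dict[str, Any]) -> Dict[str, Any]:
    """Run steps in order, feeding each one $request or $response."""
    response = request  # no previous output exists before the first step
    for step in steps:
        # Assumption: a step with no explicit data field gets the original request.
        source = step.get("data", "$request")
        payload = response if source == "$response" else request
        response = services[step["serviceName"]](payload)
    return response


def pick_split_step(steps: List[Dict[str, Any]],
                    rng: random.Random) -> Dict[str, Any]:
    """Choose one Splitter step with probability proportional to its weight."""
    weights = [step["weight"] for step in steps]
    if sum(weights) != 100:
        raise ValueError(f"splitter weights sum to {sum(weights)}, expected 100")
    return rng.choices(steps, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Stub services used only for this demonstration.
    services: Dict[str, Service] = {
        "upper": lambda p: {"text": p["text"].upper()},
        "exclaim": lambda p: {"text": p["text"] + "!"},
    }
    steps = [
        {"name": "step1", "serviceName": "upper", "data": "$request"},
        {"name": "step2", "serviceName": "exclaim", "data": "$response"},
    ]
    print(run_sequence(steps, services, {"text": "hi"}))

    split = [
        {"serviceName": "model-a", "weight": 20},
        {"serviceName": "model-b", "weight": 80},
    ]
    print(pick_split_step(split, random.Random(0))["serviceName"])
```

Changing step2 to data: "$request" would make it ignore the output of step1 and process the original input instead, which is the distinction the two forwarding modes are meant to capture.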

Usage

Use an InferenceGraph when you need to compose multiple models into a single pipeline. It is preferred over manually chaining services because it provides:

Declarative composition
Built-in routing logic
Automatic management of the router pod, deployed as either a Knative Service or a Kubernetes Deployment
Single entry point for the entire pipeline

Theoretical Basis

# Graph specification model (NOT implementation code)
InferenceGraph:
  nodes:
    root:                  # MANDATORY entry node
      routerType: <type>   # Sequence | Switch | Ensemble | Splitter
      steps:
        - name: step1
          serviceName: isvc-1    # Route to InferenceService
          data: "$request"       # Forward original request
        - name: step2
          nodeName: subgraph     # Route to another node
          data: "$response"      # Forward previous step output

Validation rules:
  - "root" node must exist
  - Step names must be unique within a node
  - Each step must have exactly one target (serviceName XOR nodeName XOR serviceUrl)
  - Splitter weights must sum to 100
  - No cycles (DAG constraint)
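
The validation rules listed above can be checked with a short routine. The following is a minimal sketch over a plain dictionary that mirrors the nodes map, not KServe's admission webhook: validate_graph and TARGET_FIELDS are hypothetical names, the target field is spelled serviceUrl on the assumption that this is the key used in manifests, and nodeName references to undefined nodes are simply ignored rather than reported.

```python
from typing import Any, Dict, List

# Assumed manifest keys for the three mutually exclusive step targets.
TARGET_FIELDS = ("serviceName", "nodeName", "serviceUrl")


def validate_graph(nodes: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of rule violations; an empty list means the graph passed."""
    errors: List[str] = []

    # Rule: the "root" node must exist.
    if "root" not in nodes:
        errors.append('missing mandatory "root" node')

    for node_name, node in nodes.items():
        steps = node.get("steps", [])
        seen = set()
        for index, step in enumerate(steps):
            # Rule: step names must be unique within a node.
            name = step.get("name")
            if name is not None:
                if name in seen:
                    errors.append(f"node {node_name}: duplicate step name {name!r}")
                seen.add(name)
            # Rule: exactly one target per step.
            targets = [field for field in TARGET_FIELDS if step.get(field)]
            if len(targets) != 1:
                errors.append(
                    f"node {node_name}: step {index} needs exactly one target, found {targets}"
                )
        # Rule: Splitter weights must sum to 100.
        if node.get("routerType") == "Splitter":
            total = sum(step.get("weight", 0) for step in steps)
            if total != 100:
                errors.append(f"node {node_name}: weights sum to {total}, expected 100")

    # Rule: no cycles. Depth-first search over nodeName edges between defined nodes.
    state: Dict[str, str] = {}

    def has_cycle(name: str) -> bool:
        state[name] = "visiting"
        for step in nodes[name].get("steps", []):
            nxt = step.get("nodeName")
            if nxt not in nodes:
                continue  # not a node reference, or reference to an undefined node
            if state.get(nxt) == "visiting":
                return True  # back edge: nxt is already on the current path
            if nxt not in state and has_cycle(nxt):
                return True
        state[name] = "done"
        return False

    for name in nodes:
        if name not in state and has_cycle(name):
            errors.append("graph contains a cycle")
            break

    return errors


if __name__ == "__main__":
    graph = {
        "root": {
            "routerType": "Sequence",
            "steps": [
                {"name": "step1", "serviceName": "isvc-1", "data": "$request"},
                {"name": "step2", "nodeName": "subgraph", "data": "$response"},
            ],
        },
        "subgraph": {
            "routerType": "Splitter",
            "steps": [
                {"serviceName": "model-a", "weight": 20},
                {"serviceName": "model-b", "weight": 80},
            ],
        },
    }
    print(validate_graph(graph))
```

Collecting all violations in a list, rather than raising on the first one, lets a caller report every problem in a manifest in a single pass.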

Related Pages

Implemented By

Implementation:Kserve_Kserve_InferenceGraph_CRD_Spec
