Principle:Kserve Kserve InferenceGraph Specification

From Leeroopedia
Domains: Pipeline, Kubernetes, Model_Serving
Last Updated: 2026-02-13 00:00 GMT

Overview

A declarative specification for defining multi-model inference pipelines as a directed graph of routing nodes and inference steps.

Description

The InferenceGraph specification is the spec of the InferenceGraph custom resource (CRD), which lets users compose multiple InferenceServices into a single inference pipeline. The spec contains:

  • Nodes: A map of named routing nodes, each with a routerType (Sequence, Switch, Ensemble, or Splitter) and a list of steps. A node named root is mandatory and serves as the entry point.
  • Steps: Each step targets exactly one of serviceName (an InferenceService), nodeName (another node in the graph), or serviceURL (an external endpoint).
  • Data forwarding: A step can receive either $request (the original request) or $response (the output of the previous step).
  • Conditions: Steps in Switch nodes carry GJSON expressions that decide which step receives the request.
  • Weights: Steps in Splitter nodes carry integer weights that must sum to 100.
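
The data-forwarding and weight semantics above can be sketched in a few lines of Python. This is an illustration only, not the KServe router implementation: the helpers run_sequence and pick_step, the handlers mapping, and the service name isvc-2 are hypothetical, and defaulting a missing data field to $request is an assumption made for the sketch.

```python
from typing import Any, Callable

# Hypothetical stand-ins for InferenceServices: name -> callable.
Handler = Callable[[Any], Any]


def run_sequence(steps: list[dict[str, Any]],
                 handlers: dict[str, Handler],
                 request: Any) -> Any:
    """Run steps in order, feeding each one $request or $response."""
    previous = request  # before the first step there is no earlier output
    for step in steps:
        mode = step.get("data", "$request")  # assumed default for this sketch
        payload = previous if mode == "$response" else request
        previous = handlers[step["serviceName"]](payload)
    return previous


def pick_step(steps: list[dict[str, Any]], point: int) -> dict[str, Any]:
    """Map a point in [0, 100) to a step by cumulative weight."""
    if not 0 <= point < 100:
        raise ValueError("point must be in [0, 100)")
    cumulative = 0
    for step in steps:
        cumulative += step["weight"]
        if point < cumulative:
            return step
    raise ValueError("weights do not sum to 100")


handlers = {"isvc-1": lambda x: x + 1, "isvc-2": lambda x: x * 10}

chained = [
    {"name": "step1", "serviceName": "isvc-1", "data": "$request"},
    {"name": "step2", "serviceName": "isvc-2", "data": "$response"},
]
independent = [
    {"name": "step1", "serviceName": "isvc-1", "data": "$request"},
    {"name": "step2", "serviceName": "isvc-2", "data": "$request"},
]
split = [
    {"name": "stable", "serviceName": "isvc-1", "weight": 80},
    {"name": "canary", "serviceName": "isvc-2", "weight": 20},
]
```

With chained, the second step consumes the first step's output; with independent, both steps see the original request. In pick_step, a caller would typically pass a random integer in [0, 100), so a step with weight 80 is chosen for 80 of the 100 possible points.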

Usage

Use this when you need to compose multiple models into a single pipeline. An InferenceGraph is preferable to manual service chaining because it provides:

  • Declarative composition
  • Built-in routing logic
  • Automatic management of the router pod, as either a Knative Service or a plain Kubernetes Deployment
  • Single entry point for the entire pipeline

Theoretical Basis

# Graph specification model (NOT implementation code)
InferenceGraph:
  nodes:
    root:                  # MANDATORY entry node
      routerType: <type>
      steps:
        - name: step1
          serviceName: isvc-1    # Route to InferenceService
          data: "$request"       # Forward original request
        - name: step2
          nodeName: subgraph     # Route to another node
          data: "$response"      # Forward previous step output

Validation rules:
  - "root" node must exist
  - Step names must be unique within a node
  - Each step must have exactly one target (serviceName XOR nodeName XOR serviceURL)
  - Splitter weights must sum to 100
  - No cycles (DAG constraint)
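
The validation rules above can be checked mechanically. The following Python sketch applies them to a graph written as a plain dict that mirrors the spec fields; it is not KServe's admission webhook, and the function names, error strings, and the service names isvc-2 and isvc-3 are hypothetical.

```python
from typing import Any

TARGET_KEYS = ("serviceName", "nodeName", "serviceURL")
Graph = dict[str, dict[str, Any]]


def has_cycle(nodes: Graph) -> bool:
    """Depth-first search over nodeName references; True if a cycle exists."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {name: WHITE for name in nodes}

    def visit(name: str) -> bool:
        color[name] = GRAY
        for step in nodes[name].get("steps", []):
            target = step.get("nodeName")
            if target not in color:  # targets a service or an unknown node
                continue
            if color[target] == GRAY:
                return True
            if color[target] == WHITE and visit(target):
                return True
        color[name] = BLACK
        return False

    return any(color[name] == WHITE and visit(name) for name in nodes)


def validate_graph(nodes: Graph) -> list[str]:
    """Return a list of rule violations; an empty list means the graph passes."""
    errors: list[str] = []
    if "root" not in nodes:
        errors.append('missing "root" node')
    for node_name, node in nodes.items():
        steps = node.get("steps", [])
        seen: set[str] = set()
        for step in steps:
            name = step.get("name")
            if name is not None:
                if name in seen:
                    errors.append(f"{node_name}: duplicate step name {name!r}")
                seen.add(name)
            targets = [key for key in TARGET_KEYS if step.get(key)]
            if len(targets) != 1:
                errors.append(f"{node_name}: step needs exactly one target")
        if node.get("routerType") == "Splitter":
            total = sum(step.get("weight", 0) for step in steps)
            if total != 100:
                errors.append(f"{node_name}: weights sum to {total}, not 100")
    if has_cycle(nodes):
        errors.append("graph contains a cycle")
    return errors


valid_graph: Graph = {
    "root": {"routerType": "Sequence", "steps": [
        {"name": "step1", "serviceName": "isvc-1", "data": "$request"},
        {"name": "step2", "nodeName": "subgraph", "data": "$response"},
    ]},
    "subgraph": {"routerType": "Splitter", "steps": [
        {"name": "a", "serviceName": "isvc-2", "weight": 70},
        {"name": "b", "serviceName": "isvc-3", "weight": 30},
    ]},
}
```

Calling validate_graph(valid_graph) returns an empty list. Collecting every violation instead of stopping at the first is a design choice for the sketch: it lets a user fix a manifest in one pass.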
