Implementation:Kserve Kserve InferenceGraph CRD Spec

From Leeroopedia
Knowledge Sources
Domains Pipeline, Kubernetes, Model_Serving
Last Updated 2026-02-13 00:00 GMT

Overview

Concrete Go type definitions for the InferenceGraph CRD, including InferenceGraphSpec, InferenceRouter, and InferenceStep types.

Description

The InferenceGraph CRD is defined in pkg/apis/serving/v1alpha1/inference_graph.go. Its spec holds a map of named InferenceRouter nodes, and each router has a RouterType and a list of InferenceStep entries. A step targets exactly one of an InferenceService (by name), another graph node, or an external URL. Validation enforces that a root node exists, that step names are unique, and that Splitter weights satisfy their constraints.
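The validation rules named above can be illustrated with a minimal sketch. It uses trimmed local stand-ins for the v1alpha1 types rather than the real package, and validateGraph is a hypothetical helper, not the KServe webhook code. The rule that Splitter weights sum to 100 is taken from the I/O Contract on this page; the actual validator may check more than this.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the v1alpha1 types, trimmed to the fields the checks need.
type InferenceStep struct {
	StepName    string
	ServiceName string
	Weight      *int64
}

type InferenceRouter struct {
	RouterType string
	Steps      []InferenceStep
}

// validateGraph is a hypothetical helper that mirrors the three documented
// rules: a "root" node exists, step names are unique within a node, and
// Splitter weights sum to 100.
func validateGraph(nodes map[string]InferenceRouter) error {
	if _, ok := nodes["root"]; !ok {
		return errors.New("graph has no root node")
	}
	for name, node := range nodes {
		seen := map[string]bool{}
		var total int64
		for _, step := range node.Steps {
			if step.StepName != "" {
				if seen[step.StepName] {
					return fmt.Errorf("node %q: duplicate step name %q", name, step.StepName)
				}
				seen[step.StepName] = true
			}
			if node.RouterType == "Splitter" {
				if step.Weight == nil {
					return fmt.Errorf("node %q: splitter step is missing a weight", name)
				}
				total += *step.Weight
			}
		}
		if node.RouterType == "Splitter" && total != 100 {
			return fmt.Errorf("node %q: splitter weights sum to %d, want 100", name, total)
		}
	}
	return nil
}

// int64Ptr returns a pointer to v, matching the *int64 Weight field.
func int64Ptr(v int64) *int64 { return &v }

func main() {
	nodes := map[string]InferenceRouter{
		"root": {
			RouterType: "Splitter",
			Steps: []InferenceStep{
				{StepName: "sklearn-model", ServiceName: "sklearn-iris", Weight: int64Ptr(80)},
				{StepName: "xgboost-model", ServiceName: "xgboost-iris", Weight: int64Ptr(20)},
			},
		},
	}
	if err := validateGraph(nodes); err != nil {
		fmt.Println("invalid graph:", err)
		return
	}
	fmt.Println("graph is valid")
}
```

Changing one weight so the total is no longer 100, or removing the root entry, makes validateGraph return an error instead.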

Usage

Write InferenceGraph YAML manifests whose steps reference running InferenceServices by name through the serviceName field.

Code Reference

Source Location

  • Repository: kserve
  • File: pkg/apis/serving/v1alpha1/inference_graph.go, Lines 35-305
  • File: pkg/apis/serving/v1alpha1/inference_graph_validation.go, Lines 109-215

Signature

// InferenceGraphSpec defines the graph structure
type InferenceGraphSpec struct {
    Nodes          map[string]InferenceRouter  `json:"nodes"`
    Resources      corev1.ResourceRequirements `json:"resources,omitempty"`
    TimeoutSeconds *int64                      `json:"timeoutSeconds,omitempty"`
    MinReplicas    *int32                      `json:"minReplicas,omitempty"`
    MaxReplicas    int32                       `json:"maxReplicas,omitempty"`
}

// InferenceRouter defines a node in the graph
type InferenceRouter struct {
    RouterType InferenceRouterType `json:"routerType"`
    Steps      []InferenceStep     `json:"steps"`
}

// InferenceStep defines a step within a router node
type InferenceStep struct {
    StepName string `json:"name,omitempty"`

    // Embedded target; see InferenceTarget below
    InferenceTarget `json:",inline"`

    Data       string              `json:"data,omitempty"`
    Weight     *int64              `json:"weight,omitempty"`
    Condition  string              `json:"condition,omitempty"`
    Dependency InferenceDependency `json:"dependency,omitempty"`
}

// InferenceTarget specifies the step's target (exactly one)
type InferenceTarget struct {
    NodeName    string `json:"nodeName,omitempty"`
    ServiceName string `json:"serviceName,omitempty"`
    ServiceURL  string `json:"serviceURL,omitempty"`
}
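The "exactly one" rule on InferenceTarget can be checked with a few lines of Go. The sketch below is illustrative only: it redeclares a local InferenceTarget without JSON tags, and countTargets and checkTarget are hypothetical helpers, not functions from the KServe package.

```go
package main

import "fmt"

// Local stand-in for the v1alpha1 InferenceTarget type.
type InferenceTarget struct {
	NodeName    string
	ServiceName string
	ServiceURL  string
}

// countTargets reports how many of the three target fields are set.
func countTargets(t InferenceTarget) int {
	n := 0
	for _, v := range []string{t.NodeName, t.ServiceName, t.ServiceURL} {
		if v != "" {
			n++
		}
	}
	return n
}

// checkTarget returns an error unless exactly one target field is set.
func checkTarget(t InferenceTarget) error {
	if n := countTargets(t); n != 1 {
		return fmt.Errorf("step must set exactly one target, got %d", n)
	}
	return nil
}

func main() {
	good := InferenceTarget{ServiceName: "sklearn-iris"}
	bad := InferenceTarget{ServiceName: "sklearn-iris", NodeName: "root"}

	fmt.Println("good:", checkTarget(good))
	fmt.Println("bad:", checkTarget(bad))
}
```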

Import

import servingv1alpha1 "github.com/kserve/kserve/pkg/apis/serving/v1alpha1"

I/O Contract

Inputs

Name | Type | Required | Description
nodes | map[string]InferenceRouter | Yes | Graph nodes (must include "root")
nodes[].routerType | InferenceRouterType | Yes | Sequence, Ensemble, Splitter, or Switch
nodes[].steps | []InferenceStep | Yes | Steps within the node
steps[].serviceName | string | One of three | Target InferenceService name
steps[].nodeName | string | One of three | Target graph node name
steps[].serviceURL | string | One of three | Target external URL
steps[].data | string | No | $request or $response forwarding
steps[].weight | *int64 | Splitter only | Routing weight (must sum to 100)
steps[].condition | string | Switch only | GJSON condition expression

Outputs

Name | Type | Description
Router Pod | Pod | Graph router binary serving on port 8080
status.url | URL | Pipeline entry endpoint
status.conditions | []Condition | GraphReady condition

Usage Examples

Ensemble Graph

apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: model-ensemble
spec:
  nodes:
    root:
      routerType: Ensemble
      steps:
        - serviceName: sklearn-iris
          name: sklearn-iris
        - serviceName: xgboost-iris
          name: xgboost-iris
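As a rough mental model of what an Ensemble node does with the manifest above, the sketch below fans one request out to every step concurrently and collects the replies in a map keyed by step name. This is an assumption about the router's behavior, not code from KServe: predictor and runEnsemble are hypothetical names, and the real router calls InferenceServices over HTTP rather than invoking Go functions.

```go
package main

import (
	"fmt"
	"sync"
)

// predictor is a hypothetical stand-in for an HTTP call to an InferenceService.
type predictor func(request string) string

// runEnsemble sends the same request to every step in parallel and
// returns the responses keyed by step name.
func runEnsemble(steps map[string]predictor, request string) map[string]string {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results = make(map[string]string, len(steps))
	)
	for name, call := range steps {
		wg.Add(1)
		go func(name string, call predictor) {
			defer wg.Done()
			out := call(request)
			mu.Lock() // guard the shared map against concurrent writes
			results[name] = out
			mu.Unlock()
		}(name, call)
	}
	wg.Wait()
	return results
}

func main() {
	steps := map[string]predictor{
		"sklearn-iris": func(req string) string { return "sklearn(" + req + ")" },
		"xgboost-iris": func(req string) string { return "xgboost(" + req + ")" },
	}
	for name, out := range runEnsemble(steps, "input") {
		fmt.Println(name, "=>", out)
	}
}
```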

Splitter Graph (A/B Testing)

apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: model-split
spec:
  nodes:
    root:
      routerType: Splitter
      steps:
        - serviceName: sklearn-iris
          name: sklearn-model
          weight: 80
        - serviceName: xgboost-iris
          name: xgboost-model
          weight: 20
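One way to picture how the 80/20 weights translate into routing is cumulative-weight selection: draw a point in [0, 100) and pick the first step whose cumulative weight exceeds it. The sketch below shows that technique under the assumption that weights sum to 100. weightedStep and pickStep are hypothetical names, and the actual KServe router may select steps differently.

```go
package main

import (
	"fmt"
	"math/rand"
)

// weightedStep is a hypothetical, trimmed view of a Splitter step.
type weightedStep struct {
	Name   string
	Weight int64
}

// pickStep maps a point in [0, 100) onto the cumulative weight ranges
// and returns the name of the step that owns that range. It returns an
// empty string if the point falls outside the total weight.
func pickStep(steps []weightedStep, point int64) string {
	var cumulative int64
	for _, s := range steps {
		cumulative += s.Weight
		if point < cumulative {
			return s.Name
		}
	}
	return ""
}

func main() {
	steps := []weightedStep{
		{Name: "sklearn-model", Weight: 80},
		{Name: "xgboost-model", Weight: 20},
	}

	// Simulate many requests; counts should land near an 80/20 split.
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pickStep(steps, rand.Int63n(100))]++
	}
	fmt.Println(counts)
}
```

Taking the random point as a parameter keeps pickStep deterministic, which makes the boundary between the two ranges (points 79 and 80 here) easy to test.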

Related Pages

Implements Principle
