Implementation:Kserve Kserve InferenceGraph CRD Spec
| Knowledge Sources | |
|---|---|
| Domains | Pipeline, Kubernetes, Model_Serving |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
Concrete Go type definitions for the InferenceGraph CRD, including InferenceGraphSpec, InferenceRouter, and InferenceStep types.
Description
The InferenceGraph CRD is defined in pkg/apis/serving/v1alpha1/inference_graph.go. The spec contains a map of named InferenceRouter nodes. Each router has a RouterType and a list of InferenceStep entries. Each step targets exactly one of the following: an InferenceService (by name), another graph node, or an external URL. Validation enforces that a root node exists, that step names are unique, and that Splitter step weights sum to 100.
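The validation rules above can be sketched in a few lines of Go. This is a minimal illustration, not the code in inference_graph_validation.go: `Step`, `Router`, `validateGraph`, and `weight` are hypothetical local stand-ins, and the error messages, the per-node scope of the uniqueness check, and the handling of unnamed steps are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// Step and Router are simplified local stand-ins for InferenceStep and
// InferenceRouter; they carry only the fields these checks need.
type Step struct {
	Name        string
	ServiceName string
	Weight      *int64
}

type Router struct {
	RouterType string
	Steps      []Step
}

// validateGraph applies three checks: a "root" node exists, named steps are
// unique within each node, and Splitter step weights sum to 100.
func validateGraph(nodes map[string]Router) error {
	if _, ok := nodes["root"]; !ok {
		return errors.New("graph has no root node")
	}
	for nodeName, node := range nodes {
		seen := map[string]bool{}
		var total int64
		for _, s := range node.Steps {
			// Assumption: unnamed steps are skipped by the uniqueness check.
			if s.Name != "" {
				if seen[s.Name] {
					return fmt.Errorf("node %q: duplicate step name %q", nodeName, s.Name)
				}
				seen[s.Name] = true
			}
			if node.RouterType == "Splitter" {
				if s.Weight == nil {
					return fmt.Errorf("node %q: splitter step %q has no weight", nodeName, s.Name)
				}
				total += *s.Weight
			}
		}
		if node.RouterType == "Splitter" && total != 100 {
			return fmt.Errorf("node %q: splitter weights sum to %d, want 100", nodeName, total)
		}
	}
	return nil
}

// weight returns a pointer to w, matching the *int64 Weight field.
func weight(w int64) *int64 { return &w }

func main() {
	nodes := map[string]Router{
		"root": {RouterType: "Splitter", Steps: []Step{
			{Name: "sklearn-model", ServiceName: "sklearn-iris", Weight: weight(80)},
			{Name: "xgboost-model", ServiceName: "xgboost-iris", Weight: weight(20)},
		}},
	}
	fmt.Println(validateGraph(nodes)) // prints <nil>
}
```

Changing one weight so the total is no longer 100, or renaming both steps to the same name, makes `validateGraph` return an error instead of nil.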
Usage
Write InferenceGraph YAML manifests referencing running InferenceService components by their serviceName.
Code Reference
Source Location
- Repository: kserve
- File: pkg/apis/serving/v1alpha1/inference_graph.go, Lines 35-305
- File: pkg/apis/serving/v1alpha1/inference_graph_validation.go, Lines 109-215
Signature
// InferenceGraphSpec defines the graph structure
type InferenceGraphSpec struct {
	Nodes          map[string]InferenceRouter  `json:"nodes"`
	Resources      corev1.ResourceRequirements `json:"resources,omitempty"`
	TimeoutSeconds *int64                      `json:"timeoutSeconds,omitempty"`
	MinReplicas    *int32                      `json:"minReplicas,omitempty"`
	MaxReplicas    int32                       `json:"maxReplicas,omitempty"`
}

// InferenceRouter defines a node in the graph
type InferenceRouter struct {
	RouterType InferenceRouterType `json:"routerType"`
	Steps      []InferenceStep     `json:"steps"`
}

// InferenceStep defines a step within a router node
type InferenceStep struct {
	StepName        string `json:"name,omitempty"`
	InferenceTarget `json:",inline"`
	Data            string                      `json:"data,omitempty"`
	Weight          *int64                      `json:"weight,omitempty"`
	Condition       string                      `json:"condition,omitempty"`
	Dependency      InferenceStepDependencyType `json:"dependency,omitempty"`
}

// InferenceTarget specifies the step's target (exactly one)
type InferenceTarget struct {
	NodeName    string `json:"nodeName,omitempty"`
	ServiceName string `json:"serviceName,omitempty"`
	ServiceURL  string `json:"serviceURL,omitempty"`
}
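The "exactly one" rule on InferenceTarget can be checked by counting the non-empty fields. The sketch below uses a hypothetical local `Target` struct and helper names (`countTargets`, `checkTarget`) that do not come from KServe; it illustrates the constraint only and makes no claim about where or how KServe enforces it.

```go
package main

import "fmt"

// Target mirrors the three InferenceTarget fields. It is a local stand-in,
// not the KServe type.
type Target struct {
	NodeName    string
	ServiceName string
	ServiceURL  string
}

// countTargets returns how many of the three target fields are set.
func countTargets(t Target) int {
	n := 0
	for _, v := range []string{t.NodeName, t.ServiceName, t.ServiceURL} {
		if v != "" {
			n++
		}
	}
	return n
}

// checkTarget returns an error unless exactly one target field is set.
func checkTarget(t Target) error {
	if n := countTargets(t); n != 1 {
		return fmt.Errorf("step must set exactly one target, got %d", n)
	}
	return nil
}

func main() {
	fmt.Println(checkTarget(Target{ServiceName: "sklearn-iris"}))
	fmt.Println(checkTarget(Target{ServiceName: "sklearn-iris", NodeName: "root"}))
}
```

The first call passes; the second fails because both serviceName and nodeName are set.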
Import
import servingv1alpha1 "github.com/kserve/kserve/pkg/apis/serving/v1alpha1"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| nodes | map[string]InferenceRouter | Yes | Graph nodes (must include "root") |
| nodes[].routerType | InferenceRouterType | Yes | Sequence, Ensemble, Splitter, or Switch |
| nodes[].steps | []InferenceStep | Yes | Steps within the node |
| steps[].serviceName | string | One of three | Target InferenceService name |
| steps[].nodeName | string | One of three | Target graph node name |
| steps[].serviceURL | string | One of three | Target external URL |
| steps[].data | string | No | Payload forwarded to the step: $request or $response |
| steps[].weight | *int64 | Splitter only | Routing weight (must sum to 100) |
| steps[].condition | string | Switch only | GJSON condition expression |
Outputs
| Name | Type | Description |
|---|---|---|
| Router Pod | Pod | Graph router binary serving on port 8080 |
| status.url | URL | Pipeline entry endpoint |
| status.conditions | []Condition | GraphReady condition |
Usage Examples
Ensemble Graph
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: model-ensemble
spec:
  nodes:
    root:
      routerType: Ensemble
      steps:
        - serviceName: sklearn-iris
          name: sklearn-iris
        - serviceName: xgboost-iris
          name: xgboost-iris
Splitter Graph (A/B Testing)
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: model-split
spec:
  nodes:
    root:
      routerType: Splitter
      steps:
        - serviceName: sklearn-iris
          name: sklearn-model
          weight: 80
        - serviceName: xgboost-iris
          name: xgboost-model
          weight: 20
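To show what the 80/20 weights mean in practice, the sketch below maps a point in the range [0, 100) to a step by walking cumulative weights. It is an illustration of weighted selection in general, not the graph router's actual implementation; `weightedStep` and `pick` are hypothetical names, and a real router would presumably draw the point from a random source rather than enumerating it.

```go
package main

import "fmt"

// weightedStep pairs a step name with its Splitter weight.
type weightedStep struct {
	Name   string
	Weight int64
}

// pick maps a point in [0, 100) to a step by walking cumulative weights.
// It returns "" if the point falls outside the total weight.
func pick(steps []weightedStep, point int64) string {
	var cumulative int64
	for _, s := range steps {
		cumulative += s.Weight
		if point < cumulative {
			return s.Name
		}
	}
	return ""
}

func main() {
	steps := []weightedStep{
		{Name: "sklearn-model", Weight: 80},
		{Name: "xgboost-model", Weight: 20},
	}
	// Enumerate every point once to show the resulting split.
	counts := map[string]int{}
	for p := int64(0); p < 100; p++ {
		counts[pick(steps, p)]++
	}
	fmt.Println(counts["sklearn-model"], counts["xgboost-model"]) // prints 80 20
}
```

Because the weights sum to 100, every point in [0, 100) lands on exactly one step, which is one reason a sum-to-100 constraint is convenient for this kind of selection.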