Implementation:Kserve Kserve InferenceGraph Full CRD
| Knowledge Sources | |
|---|---|
| Domains | Kubernetes, CRD, Inference Pipeline |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
Concrete CRD definition for the InferenceGraph custom resource in the KServe serving API, providing full OpenAPI v3 schema validation.
Description
This file contains the full CustomResourceDefinition for the InferenceGraph kind (short name: ig), produced by controller-gen v0.19.0. It belongs to the serving.kserve.io API group at version v1alpha1 and is a namespaced resource. The CRD enables composing multiple inference services into directed acyclic graphs for complex ML pipelines, supporting Sequence, Splitter, Ensemble, and Switch router types. It includes printer columns for URL, Ready status, and Age, and defines a status subresource for controller-managed state.
Usage
Apply this CRD during KServe installation to register the InferenceGraph API with the Kubernetes API server. Once registered, users can create InferenceGraph resources to define multi-step inference pipelines that chain together multiple model serving endpoints using different routing patterns.
Code Reference
Source Location
- Repository: Kserve_Kserve
- File: config/crd/full/serving.kserve.io_inferencegraphs.yaml
- Lines: 1-653
Signature
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.19.0
  name: inferencegraphs.serving.kserve.io
spec:
  group: serving.kserve.io
  names:
    kind: InferenceGraph
    listKind: InferenceGraphList
    plural: inferencegraphs
    shortNames:
    - ig
    singular: inferencegraph
  scope: Namespaced
  versions:
  - name: v1alpha1
    additionalPrinterColumns:
    - jsonPath: .status.url
      name: URL
      type: string
    - jsonPath: .status.conditions[?(@.type=='Ready')].status
      name: Ready
      type: string
    - jsonPath: .metadata.creationTimestamp
      name: Age
      type: date
    subresources:
      status: {}
Import
kubectl apply -f config/crd/full/serving.kserve.io_inferencegraphs.yaml
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| apiVersion | string | Yes | Must be serving.kserve.io/v1alpha1 |
| kind | string | Yes | Must be InferenceGraph |
| metadata | ObjectMeta | Yes | Standard Kubernetes object metadata |
| spec | InferenceGraphSpec | Yes | Graph specification defining the node topology and routing behavior |
Key spec fields:
| Field | Type | Required | Description |
|---|---|---|---|
| spec.nodes | map[string]InferenceRouter | Yes | Map of named graph nodes, each defining a router type and steps; must include a root node as the entry point |
| spec.nodes[*].routerType | enum | Yes | Routing strategy: Sequence, Splitter, Ensemble, or Switch |
| spec.nodes[*].steps | []InferenceStep | No | Ordered list of inference steps within this node, each referencing a service or another node |
| spec.nodes[*].steps[*].condition | string | No | Condition expression for Switch router to evaluate |
| spec.nodes[*].steps[*].dependency | enum | No | Dependency type: Soft or Hard |
| spec.affinity | Affinity | No | Pod scheduling affinity rules for the graph router pod |
| spec.tolerations | []Toleration | No | Pod tolerations for scheduling on tainted nodes |
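As a rough sketch of how these fields fit together, the manifest below chains a Sequence root into a Switch node. All resource and service names (routing-sketch, preprocessor, cat-model, dog-model) and the gpu taint are hypothetical. The condition strings follow the GJSON-style form seen in KServe's published examples, so treat their exact syntax as an assumption, not something this table confirms. The comments on dependency describe assumed semantics.

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: routing-sketch              # hypothetical name
  namespace: default
spec:
  nodes:
    root:                           # entry point: the node map must contain "root"
      routerType: Sequence
      steps:
      - name: preprocess
        serviceName: preprocessor   # hypothetical InferenceService
        dependency: Hard            # assumed: a failure here aborts the graph
      - name: route
        nodeName: chooser           # references another node instead of a service
        dependency: Soft            # assumed: a failure here is tolerated
    chooser:
      routerType: Switch
      steps:
      - name: cats
        serviceName: cat-model      # hypothetical InferenceService
        condition: "[@this].#(predictions.0==\"cat\")"   # assumed GJSON-style syntax
      - name: dogs
        serviceName: dog-model      # hypothetical InferenceService
        condition: "[@this].#(predictions.0==\"dog\")"   # assumed GJSON-style syntax
  tolerations:                      # lets the router pod land on a tainted node
  - key: gpu                        # hypothetical taint key
    operator: Exists
    effect: NoSchedule
```

Each step sets either serviceName or nodeName; the second form is what lets one node delegate to another and so build a graph out of the flat node map.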
Outputs
| Name | Type | Description |
|---|---|---|
| status.url | string | Endpoint URL for the inference graph |
| status.conditions | []Condition | Knative-style conditions including Ready status with reason and message |
| status.deploymentMode | string | The deployment mode used for the graph |
| status.observedGeneration | int64 | The generation most recently observed by the controller |
| status.annotations | map[string]string | Controller-managed annotations |
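For orientation, a status stanza written by the controller might look roughly like the fragment below. Every value is made up for illustration; only the field names come from the table above. The Ready printer column is populated from the status of the condition whose type is Ready.

```yaml
# Illustrative fragment of a reconciled InferenceGraph; values are placeholders.
status:
  url: http://ensemble-pipeline.default.example.com
  observedGeneration: 1
  conditions:
  - type: Ready                     # matched by the Ready printer column's jsonPath
    status: "True"
    lastTransitionTime: "2026-02-13T00:00:00Z"
```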
Usage Examples
Create an InferenceGraph
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: ensemble-pipeline
  namespace: default
spec:
  nodes:
    root:
      routerType: Ensemble
      steps:
      - serviceName: model-a
      - serviceName: model-b
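The step-level weight field is the one the Splitter router type uses to divide traffic. A minimal sketch of a weighted split follows, reusing the model-a and model-b services from the example above. The resource name is hypothetical, and the weights are assumed to need to sum to 100.

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: splitter-pipeline           # hypothetical name
  namespace: default
spec:
  nodes:
    root:
      routerType: Splitter
      steps:
      - serviceName: model-a
        weight: 80                  # share of requests routed to model-a
      - serviceName: model-b
        weight: 20                  # weights assumed to total 100
```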
Check Status
kubectl get ig ensemble-pipeline
# NAME URL READY AGE
# ensemble-pipeline http://ensemble-pipeline.default.example.com True 5m
Related Pages
- Principle:Kserve_Kserve_InferenceGraph_Specification
- Kserve_Kserve_InferenceGraph_CRD_Spec -- Go type definitions for the InferenceGraph resource
- Kserve_Kserve_Graph_Router_Engine -- Router engine implementation
- Kserve_Kserve_Graph_Pipeline_Validation -- Pipeline validation logic
- Kserve_Kserve_InferenceRouterType_Enum -- Router type enumeration definitions