Implementation:Kserve Kserve InferenceGraph Full CRD

Knowledge Sources	Kserve_Kserve KServe Docs
Domains	Kubernetes, CRD, Inference Pipeline
Last Updated	2026-02-13 00:00 GMT

Overview

Concrete CRD definition for the InferenceGraph custom resource in the KServe serving API, providing full OpenAPI v3 schema validation.

Description

This file contains the full CustomResourceDefinition for the InferenceGraph kind (short name: ig), produced by controller-gen v0.19.0. It belongs to the serving.kserve.io API group at version v1alpha1 and is a namespaced resource. The CRD enables composing multiple inference services into directed acyclic graphs for complex ML pipelines, supporting Sequence, Splitter, Ensemble, and Switch router types. It includes printer columns for URL, Ready status, and Age, and defines a status subresource for controller-managed state.

Usage

Apply this CRD during KServe installation to register the InferenceGraph API with the Kubernetes API server. Once registered, users can create InferenceGraph resources to define multi-step inference pipelines that chain together multiple model serving endpoints using different routing patterns.

Code Reference

Source Location

Repository: Kserve_Kserve
File: config/crd/full/serving.kserve.io_inferencegraphs.yaml
Lines: 1-653

Signature

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.19.0
  name: inferencegraphs.serving.kserve.io
spec:
  group: serving.kserve.io
  names:
    kind: InferenceGraph
    listKind: InferenceGraphList
    plural: inferencegraphs
    shortNames:
      - ig
    singular: inferencegraph
  scope: Namespaced
  versions:
    - name: v1alpha1
      additionalPrinterColumns:
        - jsonPath: .status.url
          name: URL
          type: string
        - jsonPath: .status.conditions[?(@.type=='Ready')].status
          name: Ready
          type: string
        - jsonPath: .metadata.creationTimestamp
          name: Age
          type: date
      subresources:
        status: {}

Import

kubectl apply -f config/crd/full/serving.kserve.io_inferencegraphs.yaml

I/O Contract

Inputs

Name	Type	Required	Description
apiVersion	string	Yes	Must be `serving.kserve.io/v1alpha1`
kind	string	Yes	Must be `InferenceGraph`
metadata	ObjectMeta	Yes	Standard Kubernetes object metadata
spec	InferenceGraphSpec	Yes	Graph specification defining the node topology and routing behavior

Key spec fields:

Field	Type	Required	Description
spec.nodes	map[string]InferenceRouter	Yes	Map of named graph nodes, each defining a router type and steps; must include a `root` node as the entry point
spec.nodes[*].routerType	enum	Yes	Routing strategy: `Sequence`, `Splitter`, `Ensemble`, or `Switch`
spec.nodes[*].steps	[]InferenceStep	No	Ordered list of inference steps within this node, each referencing a service or another node
spec.nodes[*].steps[*].condition	string	No	Condition expression for Switch router to evaluate
spec.nodes[*].steps[*].dependency	enum	No	Dependency type: `Soft` or `Hard`
spec.affinity	Affinity	No	Pod scheduling affinity rules for the graph router pod
spec.tolerations	[]Toleration	No	Pod tolerations for scheduling on tainted nodes
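
The fields in the table above can be combined as in the following sketch, which chains a Sequence root into a Switch node. It is illustrative only: the graph, node, and service names are hypothetical, and the step fields `name`, `nodeName`, and `data` are not listed in the table above (they are assumed from the upstream InferenceStep type). The condition strings assume a GJSON-style expression syntax, and the toleration key and value are placeholders.

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: routed-pipeline            # hypothetical graph name
  namespace: default
spec:
  nodes:
    root:                          # required entry point of the graph
      routerType: Sequence
      steps:
        - name: classify           # assumed step label field
          serviceName: classifier  # hypothetical InferenceService
        - name: route
          nodeName: switchNode     # assumed field: hands off to another node in this graph
          data: $response          # assumed field: forward the previous step's output
          dependency: Hard         # one of the two documented values (Soft or Hard)
    switchNode:
      routerType: Switch
      steps:
        - name: dogs
          serviceName: dog-model   # hypothetical InferenceService
          condition: "[@this].#(predictions.0==\"dog\")"   # assumed GJSON-style expression
        - name: cats
          serviceName: cat-model   # hypothetical InferenceService
          condition: "[@this].#(predictions.0==\"cat\")"
  tolerations:                     # standard Kubernetes toleration for the router pod
    - key: dedicated               # placeholder taint key
      operator: Equal
      value: inference
      effect: NoSchedule
```

Only `root` is mandatory; any other node is reachable solely through a step that references it, which is how the graph stays explicit about its topology.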

Outputs

Name	Type	Description
status.url	string	Endpoint URL for the inference graph
status.conditions	[]Condition	Knative-style conditions including Ready status with reason and message
status.deploymentMode	string	The deployment mode used for the graph
status.observedGeneration	int64	The generation most recently observed by the controller
status.annotations	map[string]string	Controller-managed annotations
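
As a rough illustration of how these outputs line up with the printer columns in the Signature section, a populated status stanza might look like the sketch below. All values are made up for illustration; `deploymentMode` and `annotations` are omitted because their concrete values depend on the installation.

```yaml
status:
  url: http://ensemble-pipeline.default.example.com   # shown in the URL column
  observedGeneration: 1                                # illustrative value
  conditions:
    - type: Ready                                      # matched by ?(@.type=='Ready')
      status: "True"                                   # shown in the Ready column
      lastTransitionTime: "2026-02-13T00:00:00Z"       # illustrative timestamp
```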

Usage Examples

Create an InferenceGraph

apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: ensemble-pipeline
  namespace: default
spec:
  nodes:
    root:
      routerType: Ensemble
      steps:
        - name: model-a
          serviceName: model-a
        - name: model-b
          serviceName: model-b
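
For comparison, a weighted traffic split is normally expressed with the Splitter router type, where each step's `weight` is read as a percentage of requests. The sketch below is a minimal example under that assumption; the graph name is hypothetical, and the expectation that the weights add up to 100 comes from upstream KServe behavior rather than from this page.

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: splitter-pipeline      # hypothetical graph name
  namespace: default
spec:
  nodes:
    root:
      routerType: Splitter
      steps:
        - serviceName: model-a
          weight: 80           # assumed to mean roughly 80% of requests
        - serviceName: model-b
          weight: 20           # weights assumed to total 100
```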

Check Status

kubectl get ig ensemble-pipeline
# NAME                URL                                              READY   AGE
# ensemble-pipeline   http://ensemble-pipeline.default.example.com     True    5m

Related Pages

Principle:Kserve_Kserve_InferenceGraph_Specification
Kserve_Kserve_InferenceGraph_CRD_Spec -- Go type definitions for the InferenceGraph resource
Kserve_Kserve_Graph_Router_Engine -- Router engine implementation
Kserve_Kserve_Graph_Pipeline_Validation -- Pipeline validation logic
Kserve_Kserve_InferenceRouterType_Enum -- Router type enumeration definitions
