Implementation:Kserve Kserve InferenceGraph Full CRD

From Leeroopedia
Revision as of 13:09, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Kserve_Kserve_InferenceGraph_Full_CRD.md)
Knowledge Sources
Domains Kubernetes, CRD, Inference Pipeline
Last Updated 2026-02-13 00:00 GMT

Overview

This page documents the full CustomResourceDefinition (CRD) for the InferenceGraph custom resource in the KServe serving API, including its OpenAPI v3 validation schema.

Description

This file contains the full CustomResourceDefinition for the InferenceGraph kind (short name: ig), produced by controller-gen v0.19.0. It belongs to the serving.kserve.io API group at version v1alpha1 and is a namespaced resource. The CRD enables composing multiple inference services into directed acyclic graphs for complex ML pipelines, supporting Sequence, Splitter, Ensemble, and Switch router types. It includes printer columns for URL, Ready status, and Age, and defines a status subresource for controller-managed state.

Usage

Apply this CRD during KServe installation to register the InferenceGraph API with the Kubernetes API server. Once registered, users can create InferenceGraph resources to define multi-step inference pipelines that chain together multiple model serving endpoints using different routing patterns.

Code Reference

Source Location

Signature

# Excerpt only: the openAPIV3Schema and the served/storage flags of the version are omitted here.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.19.0
  name: inferencegraphs.serving.kserve.io
spec:
  group: serving.kserve.io
  names:
    kind: InferenceGraph
    listKind: InferenceGraphList
    plural: inferencegraphs
    shortNames:
      - ig
    singular: inferencegraph
  scope: Namespaced
  versions:
    - name: v1alpha1
      additionalPrinterColumns:
        - jsonPath: .status.url
          name: URL
          type: string
        - jsonPath: .status.conditions[?(@.type=='Ready')].status
          name: Ready
          type: string
        - jsonPath: .metadata.creationTimestamp
          name: Age
          type: date
      subresources:
        status: {}

Import

kubectl apply -f config/crd/full/serving.kserve.io_inferencegraphs.yaml

I/O Contract

Inputs

Name | Type | Required | Description
apiVersion | string | Yes | Must be serving.kserve.io/v1alpha1
kind | string | Yes | Must be InferenceGraph
metadata | ObjectMeta | Yes | Standard Kubernetes object metadata
spec | InferenceGraphSpec | Yes | Graph specification defining the node topology and routing behavior

Key spec fields:

Field | Type | Required | Description
spec.nodes | map[string]InferenceRouter | Yes | Map of named graph nodes, each defining a router type and steps; must include a root node as the entry point
spec.nodes[*].routerType | enum | Yes | Routing strategy: Sequence, Splitter, Ensemble, or Switch
spec.nodes[*].steps | []InferenceStep | No | Ordered list of inference steps within this node, each referencing a service or another node
spec.nodes[*].steps[*].condition | string | No | Condition expression evaluated to decide whether the step runs, for example by the Switch router
spec.nodes[*].steps[*].dependency | enum | No | Dependency type: Soft or Hard
spec.affinity | Affinity | No | Pod scheduling affinity rules for the graph router pod
spec.tolerations | []Toleration | No | Pod tolerations for scheduling on tainted nodes
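
The fields above can be combined roughly as in the following sketch, which chains a Sequence root node into a Switch node. All resource and service names are hypothetical, and the condition strings are placeholders written in a GJSON-like style; the exact expression syntax accepted by the router should be checked against the KServe documentation.

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: routing-sketch               # hypothetical name
spec:
  nodes:
    root:                            # entry point of the graph
      routerType: Sequence
      steps:
        - name: preprocess
          serviceName: preprocessor  # hypothetical InferenceService
        - name: route
          nodeName: pick-model       # hands off to another node in this graph
          data: $response            # pass the previous step's output along
          dependency: Hard           # one of the two allowed values (Soft, Hard)
    pick-model:
      routerType: Switch
      steps:
        - name: small
          serviceName: model-small   # hypothetical InferenceService
          condition: '[@this].#(size=="small")'   # placeholder expression
        - name: large
          serviceName: model-large   # hypothetical InferenceService
          condition: '[@this].#(size=="large")'   # placeholder expression
```

A step targets either a service (serviceName) or another node in the same graph (nodeName), which is how nodes are composed into a larger graph.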

Outputs

Name | Type | Description
status.url | string | Endpoint URL for the inference graph
status.conditions | []Condition | Knative-style conditions including Ready status with reason and message
status.deploymentMode | string | The deployment mode used for the graph
status.observedGeneration | int64 | The generation most recently observed by the controller
status.annotations | map[string]string | Controller-managed annotations
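
As an illustration of how these fields relate to the printer columns, a status stanza for a ready graph might look roughly like the fragment below. The values are made up for illustration, and deploymentMode and annotations are left out because their contents depend on the installation.

```yaml
status:
  url: http://ensemble-pipeline.default.example.com   # shown in the URL column
  observedGeneration: 1                                # illustrative value
  conditions:
    - type: Ready        # the Ready column selects the condition with this type
      status: "True"     # and prints its status
```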

Usage Examples

Create an InferenceGraph

apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: ensemble-pipeline
  namespace: default
spec:
  nodes:
    root:
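      routerType: Ensemble   # the request goes to every step and the responses are merged
      steps:
        - serviceName: model-a
          name: model-a      # step name identifying this model within the node
        - serviceName: model-b
          name: model-b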
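
Per-step traffic weights belong to the Splitter router type, which distributes requests across targets by weight. A minimal sketch of an even split, reusing the model-a and model-b services from the example above under a hypothetical graph name:

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: splitter-pipeline    # hypothetical name
  namespace: default
spec:
  nodes:
    root:
      routerType: Splitter
      steps:
        - serviceName: model-a
          weight: 50         # percentage of traffic; weights should sum to 100
        - serviceName: model-b
          weight: 50
```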

Check Status

kubectl get ig ensemble-pipeline
# NAME                URL                                              READY   AGE
# ensemble-pipeline   http://ensemble-pipeline.default.example.com     True    5m
