Implementation:Kserve Kserve Gateway Inference Extension CRDs

From Leeroopedia
Knowledge Sources
Domains: Kubernetes, Gateway API, LLM Inference
Last Updated: 2026-02-13 00:00 GMT

Overview

This file defines the CRDs for the Gateway Inference Extension API (InferenceObjective, InferencePoolImport, and InferencePool) under the inference.networking.x-k8s.io API group.

Description

Part of the Kubernetes Gateway API Inference Extension (bundle version v1.2.0), this file contains the full OpenAPI v3 schema definitions for three custom resources that enable Kubernetes-native scheduling and routing for LLM inference workloads. InferenceObjective represents a desired model use case and carries a priority used for routing. InferencePoolImport represents an inference pool that has been exported from another cluster, so that pools can be shared across clusters. InferencePool defines a pool of inference endpoints through selector-based workload matching, an endpoint picker configuration, and a target port.

Usage

Apply these CRDs to enable the Gateway API inference extension layer in a KServe LLMInferenceService deployment. They define the resource types on which request routing, priority-based scheduling, and load balancing across inference pools are built.

Code Reference

Source Location

Signature

# CRD 1: InferenceObjective
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    inference.networking.k8s.io/bundle-version: v1.2.0
  name: inferenceobjectives.inference.networking.x-k8s.io
spec:
  group: inference.networking.x-k8s.io
  names:
    kind: InferenceObjective
    plural: inferenceobjectives
  scope: Namespaced
  versions:
  - name: v1alpha2
    # Full OpenAPI v3 schema with poolRef, priority
---
# CRD 2: InferencePoolImport
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: inferencepoolimports.inference.networking.x-k8s.io
spec:
  group: inference.networking.x-k8s.io
  names:
    kind: InferencePoolImport
    shortNames:
    - infpimp
  scope: Namespaced
---
# CRD 3: InferencePool
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: inferencepools.inference.networking.x-k8s.io
spec:
  group: inference.networking.x-k8s.io
  names:
    kind: InferencePool
    shortNames:
    - infpool
  scope: Namespaced

Import

kubectl apply -f config/llmisvc/gateway-inference-extension.yaml

I/O Contract

InferenceObjective Spec Fields

| Field | Type | Required | Description |
|---|---|---|---|
| spec.poolRef | object | Yes | Reference to the inference pool (must exist in the same namespace) |
| spec.poolRef.name | string | Yes | Name of the target InferencePool |
| spec.poolRef.group | string | No | API group (default: inference.networking.k8s.io) |
| spec.poolRef.kind | string | No | Resource kind (default: InferencePool) |
| spec.priority | integer | No | Priority for request scheduling; higher values are served first (an unset value is treated as 0) |
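
As a minimal sketch of how these fields map onto a manifest, the example below writes out `group` and `kind` instead of relying on their defaults. The names `chat-objective` and `my-inference-pool` and the namespace `default` are hypothetical. The explicit `group` value is an assumption: it is chosen to match the InferencePool CRD defined in this file, because the documented default group differs from the group these CRDs are registered under. Whether a given controller resolves the reference this way is not confirmed by this page.

```yaml
# Hypothetical InferenceObjective that sets every field listed in the table.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceObjective
metadata:
  name: chat-objective            # hypothetical name
  namespace: default              # must be the namespace of the referenced pool
spec:
  poolRef:
    name: my-inference-pool       # required: name of the target InferencePool
    # Assumption: point at the InferencePool CRD from this file rather than
    # the documented default group (inference.networking.k8s.io).
    group: inference.networking.x-k8s.io
    kind: InferencePool           # same as the documented default
  priority: 10                    # optional; an unset value is treated as 0
```

Omitting `group` and `kind` yields the shorter form shown under Usage Examples, which then relies on the defaults from the table.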

InferenceObjective Printer Columns

| Column | JSONPath | Type |
|---|---|---|
| Inference Pool | .spec.poolRef.name | string |
| Priority | .spec.priority | string |
| Age | .metadata.creationTimestamp | date |
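
In a CustomResourceDefinition, printer columns are declared per version under `additionalPrinterColumns`. The fragment below is a sketch reconstructed from the table, not copied from the source file, so key order and any extra attributes may differ from the real definition.

```yaml
# Fragment of spec.versions[] for the InferenceObjective CRD,
# reconstructed from the printer column table (not verbatim from the source).
versions:
- name: v1alpha2
  additionalPrinterColumns:
  - name: Inference Pool
    type: string
    jsonPath: .spec.poolRef.name
  - name: Priority
    type: string
    jsonPath: .spec.priority
  - name: Age
    type: date
    jsonPath: .metadata.creationTimestamp
```

These are the columns `kubectl get inferenceobjectives` prints in addition to the resource name.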

InferenceObjective Status Fields

| Field | Type | Description |
|---|---|---|
| status.conditions | array | Condition tracking (known type: Accepted); max 8 items |
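
To illustrate what a populated status could look like, the sketch below assumes the conditions follow the standard Kubernetes condition shape (`type`, `status`, `reason`, `message`, `lastTransitionTime`, `observedGeneration`). That shape is an assumption; this page confirms only the `Accepted` type and the 8-item limit. The `reason`, `message`, and timestamp values are hypothetical.

```yaml
# Hypothetical status block as a controller might write it.
status:
  conditions:
  - type: Accepted                      # the one known condition type
    status: "True"                      # condition status values are strings
    reason: Accepted                    # hypothetical reason value
    message: InferenceObjective has been accepted   # hypothetical message
    lastTransitionTime: "2026-01-01T00:00:00Z"      # hypothetical timestamp
    observedGeneration: 1
```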

InferencePoolImport Status Fields

| Field | Type | Description |
|---|---|---|
| status.controllers | array | List of controllers managing the import, each with conditions and controller name |

InferencePool Spec Fields

| Field | Type | Required | Description |
|---|---|---|---|
| spec.endpointPickerConfig | object | No | Configuration for the endpoint picker extension |
| spec.selector | map | Yes | Label selector to match pods in the inference pool |
| spec.targetPortNumber | integer | Yes | Port number on backend pods (1-65535) |
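
A minimal InferencePool using only the two required fields from the table might look like the sketch below. The `apiVersion` is an assumption, since the signature above does not show a version for this CRD. The pool name, label, and port are hypothetical. The endpoint picker configuration is left out because its sub-fields are not listed on this page; check the installed schema before relying on this form.

```yaml
# Hypothetical InferencePool with only the required fields from the table.
apiVersion: inference.networking.x-k8s.io/v1alpha2   # assumption: version not shown on this page
kind: InferencePool
metadata:
  name: my-inference-pool         # hypothetical; matches poolRef.name in the usage example
spec:
  selector:
    app: my-model-server          # hypothetical label carried by the model server pods
  targetPortNumber: 8000          # hypothetical; must be in the range 1-65535
  # endpointPickerConfig omitted: its sub-fields are not documented on this page.
```

`kubectl explain inferencepool.spec` against a cluster with the CRDs installed shows the authoritative field list.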

Usage Examples

Deploy the Gateway Inference Extension CRDs and create an InferenceObjective:

# Apply the Gateway Inference Extension CRDs
kubectl apply -f config/llmisvc/gateway-inference-extension.yaml

# Verify all CRDs are registered
kubectl get crd | grep inference.networking.x-k8s.io

# Example InferenceObjective
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceObjective
metadata:
  name: my-model-objective
spec:
  poolRef:
    name: my-inference-pool
  priority: 10
