Implementation:Kserve Kserve Gateway Inference Extension CRDs

Knowledge Sources	Kserve_Kserve
Domains	Kubernetes, Gateway API, LLM Inference
Last Updated	2026-02-13 00:00 GMT

Overview

This file defines the CRDs for the Gateway Inference Extension API (InferenceObjective, InferencePoolImport, and InferencePool) under the inference.networking.x-k8s.io API group.

Description

Part of the Kubernetes Gateway API Inference Extension (bundle version v1.2.0), this file contains the full OpenAPI v3 schema definitions for three custom resources that enable Kubernetes-native scheduling and routing of LLM inference workloads. InferenceObjective represents a desired model use case and assigns it a priority for routing. InferencePoolImport enables cross-namespace pool sharing. InferencePool defines a pool of inference endpoints through selector-based workload matching, an endpoint picker configuration, and a target port.

Usage

Apply these CRDs to enable the Gateway API inference extension layer in a KServe LLMInferenceService deployment. They must be registered before any InferenceObjective, InferencePoolImport, or InferencePool resource can be created, and they are the basis for priority-based scheduling, request routing, and load balancing across inference pools.

Code Reference

Source Location

Repository: Kserve_Kserve
File: config/llmisvc/gateway-inference-extension.yaml

Signature

# CRD 1: InferenceObjective
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    inference.networking.k8s.io/bundle-version: v1.2.0
  name: inferenceobjectives.inference.networking.x-k8s.io
spec:
  group: inference.networking.x-k8s.io
  names:
    kind: InferenceObjective
    plural: inferenceobjectives
  scope: Namespaced
  versions:
  - name: v1alpha2
    # Full OpenAPI v3 schema with poolRef, priority
---
# CRD 2: InferencePoolImport
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: inferencepoolimports.inference.networking.x-k8s.io
spec:
  group: inference.networking.x-k8s.io
  names:
    kind: InferencePoolImport
    shortNames:
    - infpimp
  scope: Namespaced
---
# CRD 3: InferencePool
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: inferencepools.inference.networking.x-k8s.io
spec:
  group: inference.networking.x-k8s.io
  names:
    kind: InferencePool
    shortNames:
    - infpool
  scope: Namespaced

Import

kubectl apply -f config/llmisvc/gateway-inference-extension.yaml

I/O Contract

InferenceObjective Spec Fields

Field	Type	Required	Description
`spec.poolRef`	object	Yes	Reference to the inference pool (must exist in the same namespace)
`spec.poolRef.name`	string	Yes	Name of the target InferencePool
`spec.poolRef.group`	string	No	API group (default: `inference.networking.k8s.io`)
`spec.poolRef.kind`	string	No	Resource kind (default: `InferencePool`)
`spec.priority`	integer	No	Priority for request scheduling; higher values are served first (default treated as 0)
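
As a minimal sketch of how these fields combine, the two manifests below point at the same pool with different priorities. The resource names (`chat-critical`, `batch-summarize`, `my-inference-pool`) are hypothetical. Writing `group` out explicitly is an assumption: the table lists a default of `inference.networking.k8s.io`, while the InferencePool CRD in this file is registered under `inference.networking.x-k8s.io`, so check which group your controller expects before copying it.

```yaml
# Higher-priority objective: served first when both compete for the pool.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceObjective
metadata:
  name: chat-critical                       # hypothetical name
spec:
  poolRef:
    name: my-inference-pool                 # must exist in the same namespace
    group: inference.networking.x-k8s.io    # assumption; omit to use the default
    kind: InferencePool                     # same as the default, shown for clarity
  priority: 10
---
# priority is omitted here, so it is treated as 0.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceObjective
metadata:
  name: batch-summarize                     # hypothetical name
spec:
  poolRef:
    name: my-inference-pool
    group: inference.networking.x-k8s.io    # assumption, as above
```

With both applied, requests associated with `chat-critical` would be expected to be scheduled ahead of those for `batch-summarize`, following the priority semantics in the table.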

InferenceObjective Printer Columns

Column	JSONPath	Type
Inference Pool	`.spec.poolRef.name`	string
Priority	`.spec.priority`	string
Age	`.metadata.creationTimestamp`	date

InferenceObjective Status Fields

Field	Type	Description
`status.conditions`	array	Condition tracking (known type: `Accepted`); max 8 items

InferencePoolImport Status Fields

Field	Type	Description
`status.controllers`	array	List of controllers managing the import, each with conditions and controller name

InferencePool Spec Fields

Field	Type	Required	Description
`spec.endpointPickerConfig`	object	No	Configuration for the endpoint picker extension
`spec.selector`	map	Yes	Label selector to match pods in the inference pool
`spec.targetPortNumber`	integer	Yes	Port number on backend pods (1-65535)
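
A minimal InferencePool that sets only the two required fields might look like the sketch below. The `v1alpha2` version is an assumption (this page shows the served version only for InferenceObjective), and the pool name, label, and port are hypothetical. `spec.endpointPickerConfig` is left out because its sub-fields are not covered on this page.

```yaml
apiVersion: inference.networking.x-k8s.io/v1alpha2   # version assumed, not confirmed by this page
kind: InferencePool
metadata:
  name: my-inference-pool          # hypothetical name
spec:
  selector:
    app: my-model-server           # hypothetical pod label to match
  targetPortNumber: 8000           # hypothetical port; must be within 1-65535
```

Because the installed schema may require more than this page lists, it is worth validating the manifest first, for example with `kubectl apply --dry-run=server -f <file>`.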

Usage Examples

Deploy the Gateway Inference Extension CRDs and create an InferenceObjective:

# Apply the Gateway Inference Extension CRDs
kubectl apply -f config/llmisvc/gateway-inference-extension.yaml

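# Verify all CRDs are registered (-F matches the group name literally, not as a regex)
kubectl get crd | grep -F inference.networking.x-k8s.io

# Example InferenceObjective manifest (YAML, not a shell command); save it to a file and apply it with kubectl apply -f <file>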
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceObjective
metadata:
  name: my-model-objective
spec:
  poolRef:
    name: my-inference-pool
  priority: 10

Related Pages

Kserve_Kserve_LLMInferenceService_Minimal_CRD - LLMInferenceService CRD that uses these gateway extensions for routing
Kserve_Kserve_DeepSeek_R1_PD_DeepEP_HT_Sample - Sample deployment leveraging gateway routing
Kserve_Kserve_DeepSeek_R1_PD_DeepEP_Pplx_Sample - Hybrid backend sample using inference pools
