Overview
This file defines the CRDs for the Gateway Inference Extension API (InferenceObjective, InferencePoolImport, and InferencePool) under the inference.networking.x-k8s.io API group.
Description
Part of the Kubernetes Gateway API Inference Extension (bundle version v1.2.0), this file contains full OpenAPI v3 schema definitions for custom resources that enable Kubernetes-native scheduling and routing for LLM inference workloads. The InferenceObjective resource represents a desired model use case with priority-based routing, the InferencePoolImport enables cross-namespace pool sharing, and the InferencePool defines a pool of inference endpoints with selector-based workload matching, endpoint picker configurations, and target port specifications.
Usage
Apply these CRDs to enable the Gateway API inference extension layer in a KServe LLMInferenceService deployment. They provide the foundation for intelligent request routing, priority-based scheduling, and load balancing across inference pools.
Code Reference
Source Location
Signature
# CRD 1: InferenceObjective
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    inference.networking.k8s.io/bundle-version: v1.2.0
  name: inferenceobjectives.inference.networking.x-k8s.io
spec:
  group: inference.networking.x-k8s.io
  names:
    kind: InferenceObjective
    plural: inferenceobjectives
  scope: Namespaced
  versions:
    - name: v1alpha2
      # Full OpenAPI v3 schema with poolRef, priority
---
# CRD 2: InferencePoolImport
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: inferencepoolimports.inference.networking.x-k8s.io
spec:
  group: inference.networking.x-k8s.io
  names:
    kind: InferencePoolImport
    shortNames:
      - infpimp
  scope: Namespaced
---
# CRD 3: InferencePool
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: inferencepools.inference.networking.x-k8s.io
spec:
  group: inference.networking.x-k8s.io
  names:
    kind: InferencePool
    shortNames:
      - infpool
  scope: Namespaced
Import
kubectl apply -f config/llmisvc/gateway-inference-extension.yaml
I/O Contract
InferenceObjective Spec Fields
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| spec.poolRef | object | Yes | Reference to the inference pool (must exist in the same namespace) |
| spec.poolRef.name | string | Yes | Name of the target InferencePool |
| spec.poolRef.group | string | No | API group (default: inference.networking.k8s.io) |
| spec.poolRef.kind | string | No | Resource kind (default: InferencePool) |
| spec.priority | integer | No | Priority for request scheduling; higher values are served first (default treated as 0) |
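As a hedged illustration of the fields above, the manifest below spells out the optional poolRef.group and poolRef.kind instead of relying on their defaults. The resource name and namespace are hypothetical. Note that the documented default group (inference.networking.k8s.io) differs from the group of the InferencePool CRD defined in this file (inference.networking.x-k8s.io). This sketch assumes the target pool was created from this file's CRD and therefore sets group explicitly; adjust it if your pool is served from a different group.

```yaml
# Sketch only: the name and namespace are hypothetical.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceObjective
metadata:
  name: chat-high-priority       # hypothetical name
  namespace: llm-serving         # hypothetical; the referenced pool must be in the same namespace
spec:
  poolRef:
    # Assumption: the pool was created from this file's InferencePool CRD.
    # If omitted, group defaults to inference.networking.k8s.io.
    group: inference.networking.x-k8s.io
    kind: InferencePool          # same as the documented default
    name: my-inference-pool
  priority: 10                   # served ahead of objectives with lower values
```

Leaving priority unset is treated as 0, so an objective like this one would be scheduled ahead of any objective in the same pool that omits the field.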
InferenceObjective Printer Columns
| Column | JSONPath | Type |
|--------|----------|------|
| Inference Pool | .spec.poolRef.name | string |
| Priority | .spec.priority | string |
| Age | .metadata.creationTimestamp | date |
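In a CustomResourceDefinition, printer columns are declared per version under additionalPrinterColumns. The fragment below is a sketch of how the three columns in the table would be expressed in that form. It is reconstructed from the table rather than copied from the file, so field order and any extra attributes may differ.

```yaml
# Fragment of spec.versions[] for the InferenceObjective CRD (reconstructed sketch).
versions:
  - name: v1alpha2
    additionalPrinterColumns:
      - name: Inference Pool
        jsonPath: .spec.poolRef.name
        type: string
      - name: Priority
        jsonPath: .spec.priority
        type: string
      - name: Age
        jsonPath: .metadata.creationTimestamp
        type: date
```

These columns are what `kubectl get inferenceobjectives` displays alongside the resource name.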
InferenceObjective Status Fields
| Field | Type | Description |
|-------|------|-------------|
| status.conditions | array | Condition tracking (known type: Accepted); max 8 items |
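For orientation, a status block with a single Accepted condition might look like the sketch below. It assumes the conditions follow the standard Kubernetes condition shape (type, status, reason, message, observedGeneration, lastTransitionTime). The reason, message, and timestamp values are placeholders, not values taken from the schema.

```yaml
# Illustrative only: status is written by a controller, not by users.
status:
  conditions:
    - type: Accepted                              # the known condition type
      status: "True"
      reason: Accepted                            # hypothetical reason value
      message: "objective accepted"               # hypothetical message
      observedGeneration: 1
      lastTransitionTime: "2025-01-01T00:00:00Z"  # placeholder timestamp
```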
InferencePoolImport Status Fields
| Field | Type | Description |
|-------|------|-------------|
| status.controllers | array | List of controllers managing the import, each with conditions and controller name |
InferencePool Spec Fields
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| spec.endpointPickerConfig | object | No | Configuration for the endpoint picker extension |
| spec.selector | map | Yes | Label selector to match pods in the inference pool |
| spec.targetPortNumber | integer | Yes | Port number on backend pods (1-65535) |
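A minimal InferencePool using only the two required fields could be sketched as follows. The apiVersion is an assumption (this page states v1alpha2 only for InferenceObjective), and the pod label and port are hypothetical; check the CRD's versions list before applying.

```yaml
# Sketch only: apiVersion, label, and port are assumptions.
apiVersion: inference.networking.x-k8s.io/v1alpha2   # assumed version
kind: InferencePool
metadata:
  name: my-inference-pool
spec:
  selector:
    app: my-model-server     # hypothetical pod label
  targetPortNumber: 8000     # hypothetical port; must be within 1-65535
  # endpointPickerConfig is listed as optional; its sub-fields are not
  # documented on this page, so it is left out of this sketch.
```

The pool name matches the poolRef.name used by the InferenceObjective example on this page, so the two manifests can be read together.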
Usage Examples
Deploy the Gateway Inference Extension CRDs and create an InferenceObjective:
# Apply the Gateway Inference Extension CRDs
kubectl apply -f config/llmisvc/gateway-inference-extension.yaml
# Verify all CRDs are registered
kubectl get crd | grep inference.networking.x-k8s.io
# Example InferenceObjective
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceObjective
metadata:
  name: my-model-objective
spec:
  poolRef:
    name: my-inference-pool
  priority: 10