Principle: KServe LLMIsvc Controller Reconciliation
| Knowledge Sources | |
|---|---|
| Domains | Kubernetes, Operator_Pattern, LLM_Serving |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
A specialized controller that reconciles LLMInferenceService resources into vLLM pods, scheduler deployments, InferencePool, and HTTPRoute resources.
Description
The LLMIsvc Controller is a dedicated controller manager, separate from the main KServe controller, that manages the LLMInferenceService lifecycle. Unlike the InferenceService controller, it:
- Creates vLLM pods directly (not via Knative)
- Manages Leader Worker Sets for multi-node inference
- Creates InferencePool and HTTPRoute resources for Gateway API routing
- Deploys a scheduler pod for intelligent endpoint selection
- Supports v1alpha1 ↔ v1alpha2 API conversion via webhooks
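The mapping from spec fields to child resources can be sketched in Go. This is a simplified illustration, not KServe source: the struct fields (`Prefill`, `Worker`) and the `childResources` helper are hypothetical stand-ins for the real CRD schema and reconciler logic.

```go
package main

import "fmt"

// PodGroup is a hypothetical, simplified pool definition.
type PodGroup struct {
	Replicas int
}

// LLMInferenceServiceSpec is an illustrative model of the spec;
// the real CRD schema may differ.
type LLMInferenceServiceSpec struct {
	Model   string
	Prefill *PodGroup // optional disaggregated prefill pool
	Worker  *PodGroup // optional multi-node workers (LeaderWorkerSet)
}

// childResources lists the kinds the controller would create for a
// given spec, mirroring the bullet points above: decode pods,
// routing objects, and the scheduler are always created, while
// prefill and worker resources are conditional on the spec.
func childResources(spec LLMInferenceServiceSpec) []string {
	kinds := []string{
		"Pod (vLLM decode)",
		"InferencePool",
		"HTTPRoute",
		"Deployment (scheduler)",
	}
	if spec.Prefill != nil {
		kinds = append(kinds, "Pod (prefill)")
	}
	if spec.Worker != nil {
		kinds = append(kinds, "LeaderWorkerSet (workers)")
	}
	return kinds
}

func main() {
	spec := LLMInferenceServiceSpec{
		Model:   "example-model",
		Prefill: &PodGroup{Replicas: 1},
	}
	fmt.Println(childResources(spec))
}
```

A spec with neither `prefill` nor `worker` set yields only the four always-present resources; each optional section adds its corresponding pool.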
Usage
This controller runs automatically after the LLMIsvc subsystem is installed. It processes LLMInferenceService create/update/delete events.
Theoretical Basis
# LLMIsvc reconciliation flow (NOT implementation code)
1. Watch LLMInferenceService events
2. Validate via webhooks (v1alpha1/v1alpha2)
3. Load LLMInferenceServiceConfig templates
4. Create/update:
a. vLLM pods (decode pool)
b. Prefill pods (if spec.prefill defined)
c. Worker pods (if spec.worker defined)
d. InferencePool resource
e. HTTPRoute resource
f. Scheduler deployment
5. Update status conditions
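The ordering of the flow above can be sketched as a minimal reconcile loop. This is plain Go modeling the sequence of steps, not controller-runtime code; the `reconciler` type and its helpers are hypothetical, and webhook validation, template loading, and status updates are reduced to placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// reconciler is a hypothetical stand-in for the LLMIsvc controller;
// it records the order in which child resources are applied.
type reconciler struct {
	applied []string
}

func (r *reconciler) apply(kind string) {
	r.applied = append(r.applied, kind)
}

// Reconcile follows the numbered flow: validate, (load config
// templates, elided), create/update child resources, then update
// status (elided).
func (r *reconciler) Reconcile(valid, hasPrefill, hasWorker bool) error {
	// Step 2: webhook validation.
	if !valid {
		return errors.New("spec rejected by validating webhook")
	}
	// Step 3: LLMInferenceServiceConfig templates would be merged here.
	// Step 4: create/update child resources.
	r.apply("vLLM decode pods")
	if hasPrefill {
		r.apply("prefill pods")
	}
	if hasWorker {
		r.apply("worker pods")
	}
	r.apply("InferencePool")
	r.apply("HTTPRoute")
	r.apply("scheduler Deployment")
	// Step 5: status conditions would be updated here.
	return nil
}

func main() {
	r := &reconciler{}
	if err := r.Reconcile(true, true, false); err != nil {
		fmt.Println("reconcile failed:", err)
		return
	}
	fmt.Println(r.applied)
}
```

The point of the sketch is the ordering: compute pools are applied before the routing objects (InferencePool, HTTPRoute) that reference them, and status is only updated once the children exist.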
Related Pages
Implemented By