Principle: KServe LLMIsvc Controller Reconciliation
| Knowledge Sources | |
|---|---|
| Domains | Kubernetes, Operator_Pattern, LLM_Serving |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
A specialized controller that reconciles LLMInferenceService resources into vLLM pods, scheduler deployments, InferencePool, and HTTPRoute resources.
Description
The LLMIsvc Controller is a dedicated controller manager, separate from the main KServe controller, that manages the LLMInferenceService lifecycle. Unlike the InferenceService controller, it:
- Creates vLLM pods directly (not via Knative)
- Manages Leader Worker Sets for multi-node inference
- Creates InferencePool and HTTPRoute resources for Gateway API routing
- Deploys a scheduler pod for intelligent endpoint selection
- Supports v1alpha1 ↔ v1alpha2 API conversion via webhooks
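The mapping from spec fields to child resources can be sketched in Go. This is a simplified illustration, not KServe source: the struct fields (`Prefill`, `Worker`) and the `childResources` helper are hypothetical stand-ins for the real CRD schema and reconciler logic.

```go
package main

import "fmt"

// PodGroup is a hypothetical, simplified pool definition.
type PodGroup struct {
	Replicas int
}

// LLMInferenceServiceSpec is an illustrative model of the spec;
// the real CRD schema may differ.
type LLMInferenceServiceSpec struct {
	Model   string
	Prefill *PodGroup // optional disaggregated prefill pool
	Worker  *PodGroup // optional multi-node workers (LeaderWorkerSet)
}

// childResources lists the kinds the controller would create for a
// given spec, mirroring the bullet points above: decode pods,
// routing objects, and the scheduler are always created, while
// prefill and worker resources are conditional on the spec.
func childResources(spec LLMInferenceServiceSpec) []string {
	kinds := []string{
		"Pod (vLLM decode)",
		"InferencePool",
		"HTTPRoute",
		"Deployment (scheduler)",
	}
	if spec.Prefill != nil {
		kinds = append(kinds, "Pod (prefill)")
	}
	if spec.Worker != nil {
		kinds = append(kinds, "LeaderWorkerSet (workers)")
	}
	return kinds
}

func main() {
	spec := LLMInferenceServiceSpec{
		Model:   "example-model",
		Prefill: &PodGroup{Replicas: 1},
	}
	fmt.Println(childResources(spec))
}
```

A spec with neither `prefill` nor `worker` set yields only the four always-present resources; each optional section adds its corresponding pool.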
Usage
This controller runs automatically after the LLMIsvc subsystem is installed. It processes LLMInferenceService create/update/delete events.
Theoretical Basis
# LLMIsvc reconciliation flow (NOT implementation code)
1. Watch LLMInferenceService events
2. Validate via webhooks (v1alpha1/v1alpha2)
3. Load LLMInferenceServiceConfig templates
4. Create/update:
a. vLLM pods (decode pool)
b. Prefill pods (if spec.prefill defined)
c. Worker pods (if spec.worker defined)
d. InferencePool resource
e. HTTPRoute resource
f. Scheduler deployment
5. Update status conditions
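The ordering of the flow above can be sketched as a minimal reconcile loop. This is plain Go modeling the sequence of steps, not controller-runtime code; the `reconciler` type and its helpers are hypothetical, and webhook validation, template loading, and status updates are reduced to placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// reconciler is a hypothetical stand-in for the LLMIsvc controller;
// it records the order in which child resources are applied.
type reconciler struct {
	applied []string
}

func (r *reconciler) apply(kind string) {
	r.applied = append(r.applied, kind)
}

// Reconcile follows the numbered flow: validate, (load config
// templates, elided), create/update child resources, then update
// status (elided).
func (r *reconciler) Reconcile(valid, hasPrefill, hasWorker bool) error {
	// Step 2: webhook validation.
	if !valid {
		return errors.New("spec rejected by validating webhook")
	}
	// Step 3: LLMInferenceServiceConfig templates would be merged here.
	// Step 4: create/update child resources.
	r.apply("vLLM decode pods")
	if hasPrefill {
		r.apply("prefill pods")
	}
	if hasWorker {
		r.apply("worker pods")
	}
	r.apply("InferencePool")
	r.apply("HTTPRoute")
	r.apply("scheduler Deployment")
	// Step 5: status conditions would be updated here.
	return nil
}

func main() {
	r := &reconciler{}
	if err := r.Reconcile(true, true, false); err != nil {
		fmt.Println("reconcile failed:", err)
		return
	}
	fmt.Println(r.applied)
}
```

The point of the sketch is the ordering: compute pools are applied before the routing objects (InferencePool, HTTPRoute) that reference them, and status is only updated once the children exist.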
Related Pages
Implemented By