Principle:KServe LLMIsvc Controller Reconciliation

From Leeroopedia
Knowledge Sources
Domains Kubernetes, Operator_Pattern, LLM_Serving
Last Updated 2026-02-13 00:00 GMT

Overview

A specialized controller that reconciles LLMInferenceService resources into vLLM pods, scheduler deployments, InferencePool, and HTTPRoute resources.

Description

The LLMIsvc Controller is a dedicated controller manager (separate from the main KServe controller) that handles the LLMInferenceService lifecycle. Unlike the InferenceService controller, it:

  • Creates vLLM pods directly (not via Knative)
  • Manages LeaderWorkerSets for multi-node inference
  • Creates InferencePool and HTTPRoute resources for Gateway API routing
  • Deploys a scheduler pod for intelligent endpoint selection
  • Supports v1alpha1 ↔ v1alpha2 API conversion via webhooks
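The bullet points above can be summarized as a mapping from the spec to a set of child resources. The sketch below is illustrative only, not KServe source; the field names (`prefill`, `worker`) follow the flow described on this page, and the resource names are hypothetical simplifications.

```python
# Illustrative sketch (NOT KServe source): which child resources the
# LLMIsvc controller derives from an LLMInferenceService spec.
# Field and resource names are hypothetical simplifications.

def desired_child_resources(spec: dict) -> list[str]:
    """Return the kinds of child resources the controller would create."""
    resources = [
        "Deployment/vllm-decode",   # vLLM pods created directly, not via Knative
        "InferencePool",            # Gateway API inference extension resource
        "HTTPRoute",                # Gateway API routing resource
        "Deployment/scheduler",     # scheduler pod for endpoint selection
    ]
    if spec.get("prefill"):
        resources.append("Deployment/vllm-prefill")
    if spec.get("worker"):
        # Multi-node inference uses a LeaderWorkerSet rather than a plain
        # Deployment, per the bullet above.
        resources.append("LeaderWorkerSet/vllm-workers")
    return resources
```

For example, a spec with a `prefill` section but no `worker` section would yield the four baseline resources plus a prefill Deployment, and no LeaderWorkerSet.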

Usage

This controller runs automatically after the LLMIsvc subsystem is installed. It processes LLMInferenceService create/update/delete events.
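For orientation, here is a hypothetical minimal LLMInferenceService object, expressed as a Python dict purely for illustration. Real objects are applied as YAML manifests, and the API group, version, and field names shown here are assumptions based on the v1alpha1/v1alpha2 versions mentioned above, not confirmed schema.

```python
# Hypothetical minimal LLMInferenceService, shown as a Python dict for
# illustration. API group/version and field names are assumptions.
llm_isvc = {
    "apiVersion": "serving.kserve.io/v1alpha1",  # assumed group/version
    "kind": "LLMInferenceService",
    "metadata": {"name": "demo-llm"},
    "spec": {
        "model": {"uri": "hf://example-org/example-model"},  # placeholder
        # "prefill": {...}  # optional: would enable a separate prefill pool
        # "worker": {...}   # optional: would enable multi-node inference
    },
}
```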

Theoretical Basis

# LLMIsvc reconciliation flow (NOT implementation code)
1. Watch LLMInferenceService events
2. Validate via webhooks (v1alpha1/v1alpha2)
3. Load LLMInferenceServiceConfig templates
4. Create/update:
   a. vLLM pods (decode pool)
   b. Prefill pods (if spec.prefill defined)
   c. Worker pods (if spec.worker defined)
   d. InferencePool resource
   e. HTTPRoute resource
   f. Scheduler deployment
5. Update status conditions
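The numbered flow above can be sketched as a single reconcile pass. This is a minimal illustration under stated assumptions, not implementation code: the helper structure and the `"defaults"` template key are hypothetical, and steps 1–2 (watching and webhook validation/conversion) are assumed to have happened before reconcile runs.

```python
# Illustrative reconcile pass (NOT implementation code), mirroring the
# numbered flow above. Helper names and dict shapes are hypothetical.

def reconcile(llm_isvc: dict, config_templates: dict) -> dict:
    """One reconcile pass: compute desired child resources, report status."""
    # Steps 1-2: watch events and webhook validation/conversion happen
    # upstream, so reconcile sees an already-validated spec.
    spec = llm_isvc["spec"]

    # Step 3: merge defaults from LLMInferenceServiceConfig templates
    # (template key "defaults" is a stand-in for illustration).
    merged = {**config_templates.get("defaults", {}), **spec}

    # Step 4: compute desired children with create-or-update semantics.
    desired = ["vllm-decode-pods", "InferencePool", "HTTPRoute", "scheduler"]
    if merged.get("prefill"):
        desired.append("prefill-pods")   # 4b: only if spec.prefill is set
    if merged.get("worker"):
        desired.append("worker-pods")    # 4c: only if spec.worker is set

    # Step 5: report status conditions on the parent object.
    return {
        "conditions": [{"type": "Ready", "status": "True"}],
        "created": desired,
    }
```

A real controller would diff desired against observed cluster state and requeue on conflict; the sketch only shows how the spec shape drives which children exist.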

Related Pages

Implemented By
