Principle:Kserve Kserve Controller Deployment

From Leeroopedia
Knowledge Sources
Domains: Kubernetes, Operator_Pattern, Infrastructure
Last Updated: 2026-02-13 00:00 GMT

Overview

A deployment pattern for packaging Kubernetes operators as controller manager pods that watch custom resources and reconcile cluster state toward the desired configuration.

Description

Controller Deployment defines how KServe operator binaries are packaged and deployed into a Kubernetes cluster. KServe uses multiple dedicated controller managers, each responsible for a distinct set of custom resources:

  • kserve-controller-manager -- the primary controller handling InferenceService, InferenceGraph, ServingRuntime, and TrainedModel resources.
  • llmisvc-controller-manager -- a separate controller for LLMInferenceService and LLMInferenceServiceConfig resources, with its own webhook and RBAC configuration.
  • localmodel-controller-manager -- manages LocalModelCache and LocalModelNodeGroup resources for node-local model caching.
  • localmodelnode-agent -- a DaemonSet (rather than a Deployment) that runs on every eligible node to download model files and manage their lifecycle on local storage.

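Each component follows the standard Kubernetes operator deployment pattern: a Deployment (or, for the node agent, a DaemonSet) that runs under its own service account with RBAC rules. The Deployment-based managers use leader election so that only one replica reconciles at a time, and controllers that serve webhooks have their TLS certificates injected.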
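The leader election mentioned above can be illustrated with a small in-memory simulation. This is a minimal sketch, not KServe or client-go code: the `Lease` class, the `try_acquire` and `active_reconcilers` helpers, the pod names, and the 15-second duration are all assumptions made for illustration. Real controllers coordinate through a Lease object stored in the cluster rather than a local variable.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lease:
    """Hypothetical in-memory stand-in for a cluster-stored lease record."""
    holder: Optional[str] = None   # identity of the current leader, if any
    renew_time: float = 0.0        # timestamp of the last successful renewal
    duration: float = 15.0         # assumed validity window in seconds


def try_acquire(lease: Lease, candidate: str, now: float) -> bool:
    """Return True if `candidate` holds the lease after this attempt.

    A candidate wins when the lease is unheld, already its own (a renewal),
    or has not been renewed within `duration` seconds (expired).
    """
    expired = now - lease.renew_time > lease.duration
    if lease.holder is None or lease.holder == candidate or expired:
        lease.holder = candidate
        lease.renew_time = now
        return True
    return False


def active_reconcilers(lease: Lease, replicas: List[str], now: float) -> List[str]:
    """Let every replica attempt the lease; only the winner may reconcile."""
    return [pod for pod in replicas if try_acquire(lease, pod, now)]


if __name__ == "__main__":
    lease = Lease()
    pods = ["manager-a", "manager-b"]  # hypothetical replica names
    print(active_reconcilers(lease, pods, now=0.0))
    print(active_reconcilers(lease, pods, now=10.0))
    # manager-a stops renewing; manager-b takes over once the lease expires.
    print(active_reconcilers(lease, ["manager-b"], now=30.0))
```

The point of the sketch is the invariant: however many replicas run, at most one holds the lease at any moment, and a standby takes over only after the current holder stops renewing.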

Usage

Use this principle when:

  • Installing or upgrading KServe on a Kubernetes cluster
  • Configuring resource limits and replicas for operator pods
  • Debugging controller availability or leader election issues
  • Understanding the separation of concerns between KServe control plane components

Theoretical Basis

# Kubernetes operator deployment pattern (NOT implementation code)
Deployment-based controller:
  1. Deployment runs a single replica (or several replicas for HA, coordinated by leader election)
  2. Pod runs the controller-manager binary
  3. Binary registers watchers for its CRD types
  4. Leader election ensures only one active reconciler
  5. Webhooks served via TLS certificates (cert-manager injection)

DaemonSet-based agent:
  1. DaemonSet ensures one pod per eligible node (via nodeSelector/tolerations)
  2. Agent pod has hostPath volume mounts for local model storage
  3. Agent watches LocalModelNode resources assigned to its node
  4. Downloads or evicts models from local disk

KServe control plane topology:
  kserve-controller-manager (Deployment)
    → InferenceService, InferenceGraph, ServingRuntime, TrainedModel
  llmisvc-controller-manager (Deployment)
    → LLMInferenceService, LLMInferenceServiceConfig
  localmodel-controller-manager (Deployment)
    → LocalModelCache, LocalModelNodeGroup
  localmodelnode-agent (DaemonSet)
    → LocalModelNode (per-node model file management)
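The topology above can also be written as a lookup table, which is handy when debugging which component should be reconciling a given resource. This is an illustrative sketch only: the component names and resource kinds are taken from the listing above, while `CONTROL_PLANE`, `owner_of`, and `components_of_kind` are hypothetical helper names, not part of KServe.

```python
from typing import Dict, List, Optional, Tuple

# component name -> (workload kind, custom resource kinds it handles),
# transcribed from the topology listing above.
CONTROL_PLANE: Dict[str, Tuple[str, List[str]]] = {
    "kserve-controller-manager": (
        "Deployment",
        ["InferenceService", "InferenceGraph", "ServingRuntime", "TrainedModel"],
    ),
    "llmisvc-controller-manager": (
        "Deployment",
        ["LLMInferenceService", "LLMInferenceServiceConfig"],
    ),
    "localmodel-controller-manager": (
        "Deployment",
        ["LocalModelCache", "LocalModelNodeGroup"],
    ),
    "localmodelnode-agent": (
        "DaemonSet",
        ["LocalModelNode"],
    ),
}


def owner_of(resource_kind: str) -> Optional[str]:
    """Return the component that handles `resource_kind`, or None if unknown."""
    for component, (_workload, kinds) in CONTROL_PLANE.items():
        if resource_kind in kinds:
            return component
    return None


def components_of_kind(workload_kind: str) -> List[str]:
    """List components deployed as the given workload kind."""
    return [
        component
        for component, (workload, _kinds) in CONTROL_PLANE.items()
        if workload == workload_kind
    ]


if __name__ == "__main__":
    print(owner_of("TrainedModel"))
    print(components_of_kind("DaemonSet"))
```

For example, if a LocalModelCache is not progressing, `owner_of("LocalModelCache")` points to localmodel-controller-manager as the first pod whose logs and leader status are worth checking.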
