Implementation:Haotian liu LLaVA Controller Class

From Leeroopedia
Revision as of 12:55, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Haotian_liu_LLaVA_Controller_Class.md)

Overview

The Controller class is a concrete tool for managing distributed model workers through a FastAPI-based controller server. It provides centralized worker management, with HTTP endpoints for worker registration, heartbeats, model listing, and inference routing.

Source

  • File: llava/serve/controller.py
  • Lines: L57-171 (Controller class), L239-298 (FastAPI routes + main)

Signature

from typing import List


class Controller:
    def __init__(self, dispatch_method: str):
        """Create a controller; dispatch_method is 'lottery' or 'shortest_queue'."""

    def register_worker(self, worker_name: str, check_heart_beat: bool, worker_status: dict) -> bool:
        """Register a worker and optionally start heartbeat monitoring."""

    def get_worker_address(self, model_name: str) -> str:
        """Return the address of a worker serving the given model using the configured dispatch method."""

    def list_models(self) -> List[str]:
        """Aggregate and return all model names served by registered workers."""

    def remove_worker(self, worker_name: str):
        """Remove a worker from the registry (called on heartbeat expiration)."""

    def refresh_all_workers(self):
        """Ping all registered workers and remove unresponsive ones."""

CLI Usage

python -m llava.serve.controller \
    --host localhost \
    --port 21001 \
    --dispatch-method shortest_queue
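
To make the three flags above easier to reason about, here is a minimal argparse sketch that accepts the same options. It is an illustration only, not the parser from llava/serve/controller.py: the function name build_parser is hypothetical, and the defaults are simply the values used in the command above, not confirmed defaults of the real CLI.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags in the CLI example."""
    parser = argparse.ArgumentParser(
        description="Controller launch options (illustrative sketch)"
    )
    # Defaults below are copied from the example command, not from the source file.
    parser.add_argument("--host", type=str, default="localhost")
    parser.add_argument("--port", type=int, default=21001)
    parser.add_argument(
        "--dispatch-method",
        type=str,
        choices=["lottery", "shortest_queue"],  # the two methods named on this page
        default="shortest_queue",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # argparse converts "--dispatch-method" into the attribute "dispatch_method".
    print(args.host, args.port, args.dispatch_method)
```

Restricting the dispatch method with choices means a typo such as shortest-queue fails at parse time rather than after the server starts.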

FastAPI Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /register_worker | POST | Register a new worker or update an existing one |
| /list_models | POST | Return list of all available model names |
| /get_worker_address | POST | Get a worker address for a given model (uses dispatch method) |
| /receive_heart_beat | POST | Receive heartbeat from a worker, reset expiration timer |
| /worker_generate_stream | POST | Proxy a streaming generation request to the selected worker |
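
Since every endpoint listed here is a POST route, a client can talk to the controller with plain JSON requests. The sketch below uses only the Python standard library. The helper names (build_post, post_json), the "model" payload key, and the model name in the usage note are assumptions for illustration; this page does not specify request or response schemas, so check them against the source file before relying on them.

```python
import json
import urllib.request
from typing import Optional

# Host and port taken from the CLI example on this page.
CONTROLLER_URL = "http://localhost:21001"


def build_post(endpoint: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a controller endpoint."""
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        url=f"{CONTROLLER_URL}{endpoint}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(endpoint: str, payload: Optional[dict] = None, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON reply (needs a running controller)."""
    with urllib.request.urlopen(build_post(endpoint, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a controller running, post_json("/list_models") would return whatever JSON the server sends back, and post_json("/get_worker_address", {"model": "some-model-name"}) would ask for a worker, assuming the payload key is correct.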

Import

from llava.serve.controller import Controller

Inputs

None (standalone server). The controller is configured via CLI arguments at launch time.

Outputs

A running HTTP server on {host}:{port} that manages worker registration, dispatch, and inference proxying.

Description

The Controller class is the central coordinator in LLaVA's distributed serving architecture. It uses uvicorn as the ASGI server and exposes a FastAPI application.

Key behaviors:

  • On /register_worker, the controller stores the worker address, its status (including speed and model names), and optionally starts a heartbeat monitoring thread.
  • The heartbeat thread checks every 30 seconds whether the worker has sent a heartbeat within the last 90 seconds. If not, the worker is removed.
  • On /get_worker_address, the controller selects a worker based on the configured dispatch method:
    • lottery -- Random selection weighted by worker speed.
    • shortest_queue -- Selects the worker with the smallest queue length.
  • On /worker_generate_stream, the controller resolves a worker address and proxies the streaming inference request.
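
The registration, heartbeat expiration, and dispatch behaviors listed above can be sketched as a small in-memory registry. This is a toy model under stated assumptions, not the code in llava/serve/controller.py: the names ToyRegistry and WorkerRecord, the field names, and the empty-string return when no worker serves a model are all hypothetical. The 90-second expiration window is the value given in the description, and shortest_queue compares raw queue lengths exactly as worded there.

```python
import random
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class WorkerRecord:
    """Hypothetical per-worker record; field names are illustrative."""
    model_names: List[str]
    speed: float
    queue_length: int
    last_heart_beat: float


class ToyRegistry:
    def __init__(self, dispatch_method: str, expiration_s: float = 90.0):
        if dispatch_method not in ("lottery", "shortest_queue"):
            raise ValueError(f"unknown dispatch method: {dispatch_method}")
        self.dispatch_method = dispatch_method
        self.expiration_s = expiration_s
        self.workers: Dict[str, WorkerRecord] = {}

    def register(self, name: str, model_names: List[str], speed: float,
                 queue_length: int, now: Optional[float] = None) -> None:
        """Add a worker, or overwrite the record of an existing one."""
        now = time.time() if now is None else now
        self.workers[name] = WorkerRecord(list(model_names), speed, queue_length, now)

    def heart_beat(self, name: str, queue_length: int,
                   now: Optional[float] = None) -> bool:
        """Reset a worker's expiration timer; False means it is unknown."""
        if name not in self.workers:
            return False
        record = self.workers[name]
        record.queue_length = queue_length
        record.last_heart_beat = time.time() if now is None else now
        return True

    def remove_expired(self, now: Optional[float] = None) -> List[str]:
        """Drop workers whose last heartbeat is older than the window."""
        now = time.time() if now is None else now
        expired = [n for n, w in self.workers.items()
                   if now - w.last_heart_beat > self.expiration_s]
        for n in expired:
            del self.workers[n]
        return expired

    def pick(self, model_name: str, rng=random) -> str:
        """Choose a worker serving model_name with the configured method."""
        candidates = [(n, w) for n, w in self.workers.items()
                      if model_name in w.model_names]
        if not candidates:
            return ""  # assumption: empty string signals "no worker"
        if self.dispatch_method == "lottery":
            # Random draw weighted by speed; assumes at least one positive speed.
            names = [n for n, _ in candidates]
            weights = [w.speed for _, w in candidates]
            return rng.choices(names, weights=weights, k=1)[0]
        # shortest_queue: the worker with the smallest reported queue length.
        return min(candidates, key=lambda item: item[1].queue_length)[0]
```

Passing now explicitly keeps the sketch deterministic for experiments; a background thread that calls remove_expired() every 30 seconds would reproduce the periodic check described above.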

Metadata

| Field | Value |
| --- | --- |
| Knowledge Sources | Repo - LLaVA - https://github.com/haotian-liu/LLaVA |
| Domains | Distributed_Systems, Model_Serving |
| Last Updated | 2026-02-13 14:00 GMT |
