Implementation:OpenGVLab InternVL Streamlit Controller

Knowledge Sources	OpenGVLab_InternVL
Domains	Serving, Distributed Systems, Demo
Last Updated	2026-02-07 14:00 GMT

Overview

FastAPI-based controller server for the Streamlit demo that manages distributed model workers, handling registration, health monitoring, load balancing, and streaming request routing with InternVL-aware model sorting.

Description

This controller is a variant of the LLaVA controller, adapted for the InternVL Streamlit demo. The Controller class maintains a registry of WorkerInfo instances that track each worker's model names, speed, queue length, and heartbeat status. It supports two DispatchMethod strategies: LOTTERY (weighted random selection, with weights given by worker speed) and SHORTEST_QUEUE (selects the worker with the lowest ratio of queue length to speed).
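The two dispatch strategies can be illustrated with a minimal sketch. It is not the code from controller.py: the Worker class is a hypothetical, trimmed-down stand-in for WorkerInfo, and the function names pick_lottery and pick_shortest_queue are invented for illustration.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Worker:
    # Hypothetical stand-in for WorkerInfo, keeping only the dispatch fields.
    model_names: List[str]
    speed: int
    queue_length: int


def pick_lottery(workers: Dict[str, Worker], model: str,
                 rng: random.Random) -> Optional[str]:
    """LOTTERY: weighted random draw, faster workers are drawn more often."""
    names = [n for n, w in workers.items() if model in w.model_names]
    weights = [workers[n].speed for n in names]
    if not names or sum(weights) <= 0:
        return None  # no worker serves this model
    return rng.choices(names, weights=weights, k=1)[0]


def pick_shortest_queue(workers: Dict[str, Worker], model: str) -> Optional[str]:
    """SHORTEST_QUEUE: lowest queue_length / speed ratio wins."""
    candidates = {n: w for n, w in workers.items()
                  if model in w.model_names and w.speed > 0}
    if not candidates:
        return None
    return min(candidates,
               key=lambda n: candidates[n].queue_length / candidates[n].speed)
```

With two workers serving the same model, one with speed 1 and queue length 4 and one with speed 2 and queue length 6, the ratios are 4 and 3, so the shortest-queue sketch selects the second worker even though its raw queue is longer.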

Key differences from the LLaVA controller include:

InternVL-aware model sorting: The list_models() method uses a custom sort key that orders models by size, parsed from names matching the "InternVL2-<N>B" pattern. "Pro" models are placed first (priority 999) and names that do not match the pattern are placed last, which gives the demo UI a user-friendly model selection order.
Default port: Runs on 0.0.0.0:10075 (vs. localhost:21001 for LLaVA).
Heartbeat expiration: Set to 30 seconds as a module-level constant.
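The model ordering described in the list above can be sketched as follows. This is an illustration rather than the code in controller.py. It assumes larger models sort before smaller ones (consistent with "Pro" receiving the highest priority, 999) and that non-matching names are ordered alphabetically among themselves. The helper names size_key, sort_key, and sort_models are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

PRO_PRIORITY = 999  # priority the page assigns to "Pro" models


def size_key(name: str) -> int:
    """Return a size priority: 999 for Pro, N for InternVL2-<N>B, else -1."""
    if "Pro" in name:
        return PRO_PRIORITY
    match = re.search(r"InternVL2-(\d+)B", name)
    return int(match.group(1)) if match else -1


def sort_key(name: str) -> Tuple[int, int, str]:
    size = size_key(name)
    if size == -1:
        return (1, 0, name)   # non-matching names go last, alphabetically
    return (0, -size, name)   # assumed: larger size sorts earlier


def sort_models(names: Iterable[str]) -> List[str]:
    # Deduplicate first, since several workers may serve the same model.
    return sorted(set(names), key=sort_key)
```

Under these assumptions, a name without a parseable size (for example a non-InternVL model) always appears after every InternVL2 entry, regardless of its own parameter count.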

The controller provides REST endpoints: /register_worker, /refresh_all_workers, /list_models (with sorted output), /get_worker_address, /receive_heart_beat, /worker_generate_stream (proxy), and /worker_get_status (aggregated). The controller can also register with a higher-level controller as if it were a worker, which allows isolated sub-networks to be connected hierarchically.
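The 30-second heartbeat expiration interacts with /receive_heart_beat roughly as sketched below: workers that opted into heartbeat checking and have not reported within the window are dropped from the registry. This is an assumption-laden sketch, not the controller's code. The constant name HEART_BEAT_EXPIRATION, the Beat class, and the function remove_stale_workers are hypothetical, and timestamps are modeled as floats from time.time().

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional

HEART_BEAT_EXPIRATION = 30  # seconds, as stated on this page; name is hypothetical


@dataclass
class Beat:
    # Hypothetical stand-in for the heartbeat fields of WorkerInfo.
    check_heart_beat: bool
    last_heart_beat: float


def remove_stale_workers(workers: Dict[str, Beat],
                         now: Optional[float] = None) -> List[str]:
    """Delete workers whose last heartbeat is older than the expiration window."""
    now = time.time() if now is None else now
    cutoff = now - HEART_BEAT_EXPIRATION
    stale = [name for name, w in workers.items()
             if w.check_heart_beat and w.last_heart_beat < cutoff]
    for name in stale:
        del workers[name]
    return stale
```

Workers registered with check_heart_beat set to False are never removed by this sketch, no matter how old their last report is.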

Usage

Use this controller as the central orchestration service for the InternVL Streamlit demo, coordinating multiple GPU workers serving different InternVL model variants.

Code Reference

Source Location

Repository: OpenGVLab_InternVL
File: streamlit_demo/controller.py
Lines: 1-291

Signature

import dataclasses
from enum import Enum, auto
from typing import List

class DispatchMethod(Enum):
    LOTTERY = auto()
    SHORTEST_QUEUE = auto()

@dataclasses.dataclass
class WorkerInfo:
    model_names: List[str]
    speed: int
    queue_length: int
    check_heart_beat: bool
    last_heart_beat: str

class Controller:
    def __init__(self, dispatch_method: str): ...
    def register_worker(self, worker_name, check_heart_beat, worker_status): ...
    def list_models(self): ...  # InternVL-sorted
    def get_worker_address(self, model_name): ...
    def receive_heart_beat(self, worker_name, queue_length): ...
    def worker_api_generate_stream(self, params): ...
    def worker_api_get_status(self): ...

Import

# Standalone server script:
# python streamlit_demo/controller.py --host 0.0.0.0 --port 10075 --dispatch-method shortest_queue

I/O Contract

Inputs

Name	Type	Required	Description
--host	str	No	Server host (default: "0.0.0.0")
--port	int	No	Server port (default: 10075)
--dispatch-method	str	No	Load balancing: "lottery" or "shortest_queue" (default: "shortest_queue")

Outputs

Name	Type	Description
REST API	FastAPI endpoints	Worker management, model listing (sorted by InternVL model size), request dispatch

Usage Examples

Basic Usage

# Start the Streamlit demo controller:
# python streamlit_demo/controller.py --port 10075

# Workers register and report heartbeats
# The Streamlit UI queries /list_models for sorted model selection
# Generation requests are proxied via /worker_generate_stream
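A client-side sketch of querying the controller is shown below. It assumes the endpoints accept POST requests with a JSON body, as in the LLaVA controller this one derives from. The "model" payload key and the helper names build_post and call are assumptions for illustration, so check them against controller.py before relying on them.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_post(base_url: str, endpoint: str,
               payload: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Build a JSON POST request for a controller endpoint (not sent yet)."""
    data = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + endpoint,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(request: urllib.request.Request, timeout: float = 5.0) -> Any:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Assumed usage against a controller started on the default port:
# models = call(build_post("http://localhost:10075", "/list_models"))
# addr = call(build_post("http://localhost:10075", "/get_worker_address",
#                        {"model": "InternVL2-8B"}))  # payload key is assumed
```

Separating request construction from sending keeps the sketch usable without a running controller, for example when checking URLs and payloads in a test.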

Related Pages

Principle:OpenGVLab_InternVL_Distributed_Worker_Management
