Implementation:OpenGVLab InternVL Streamlit Controller
| Knowledge Sources | |
|---|---|
| Domains | Serving, Distributed Systems, Demo |
| Last Updated | 2026-02-07 14:00 GMT |
Overview
A FastAPI-based controller server for the InternVL Streamlit demo. It manages distributed model workers (registration, health monitoring, load balancing, and streaming request routing) and lists the available models in an InternVL-aware sorted order.
Description
This controller is a variant of the LLaVA controller, adapted for the InternVL Streamlit demo. The Controller class maintains a registry of WorkerInfo instances that track each worker's model names, speed, queue length, and heartbeat status. It supports two DispatchMethod strategies: LOTTERY (random selection weighted by worker speed) and SHORTEST_QUEUE (routes to the worker with the lowest queue-length-to-speed ratio).
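The two dispatch strategies described above can be sketched as follows. This is a minimal illustration, not the controller's actual code: the `pick_worker` helper, its `rng` parameter, and the trimmed-down `WorkerInfo` are assumptions made for the example, and workers with a non-positive speed are simply skipped.

```python
import dataclasses
import random
from enum import Enum, auto
from typing import Dict, List, Optional


class DispatchMethod(Enum):
    LOTTERY = auto()
    SHORTEST_QUEUE = auto()


@dataclasses.dataclass
class WorkerInfo:
    # Trimmed-down version of the registry entry, for illustration only.
    model_names: List[str]
    speed: int
    queue_length: int


def pick_worker(workers: Dict[str, WorkerInfo], model_name: str,
                method: DispatchMethod,
                rng: Optional[random.Random] = None) -> str:
    """Hypothetical helper: return a worker name serving model_name, or ""."""
    rng = rng or random.Random()
    # Keep only workers that serve the model and report a usable speed.
    names = [name for name, info in workers.items()
             if model_name in info.model_names and info.speed > 0]
    if not names:
        return ""
    if method is DispatchMethod.LOTTERY:
        # Weighted random choice: faster workers are picked more often.
        weights = [workers[name].speed for name in names]
        return rng.choices(names, weights=weights, k=1)[0]
    # SHORTEST_QUEUE: lowest queue-length-to-speed ratio wins.
    return min(names, key=lambda n: workers[n].queue_length / workers[n].speed)
```

With the shortest-queue strategy, a fast worker with a slightly longer queue can still be preferred over a slow worker with a shorter one, because the ratio rather than the raw queue length is compared.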
Key differences from the LLaVA controller include:
- InternVL-aware model sorting: The list_models() method uses a custom sort that orders models by size (extracted from "InternVL2-NB" naming patterns), placing "Pro" models first (priority 999) and non-matching models last, providing a user-friendly model selection order in the demo UI.
- Default port: Runs on 0.0.0.0:10075 (vs. localhost:21001 for LLaVA).
- Heartbeat expiration: Set to 30 seconds as a module-level constant.
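The model ordering described in the first bullet can be approximated with a two-part sort key. The sketch below is an assumption-laden illustration rather than the body of list_models(): the function names, the regular expression, the largest-first ordering, and the alphabetical tie-break for non-matching names are all choices made for the example.

```python
import re
from typing import List, Tuple, Union

PRO_PRIORITY = 999  # priority the description assigns to "Pro" models


def size_key(name: str) -> int:
    """Return the model size in billions, 999 for "Pro", or -1 if unknown."""
    if "Pro" in name:
        return PRO_PRIORITY
    match = re.match(r"InternVL2-(\d+)B", name)  # assumed naming pattern
    return int(match.group(1)) if match else -1


def sort_key(name: str) -> Tuple[int, Union[int, str]]:
    size = size_key(name)
    if size == -1:
        # Group 1: non-matching names go last, ordered alphabetically.
        return (1, name)
    # Group 0: matching names, larger sizes first (assumed direction).
    return (0, -size)


def sort_models(names: List[str]) -> List[str]:
    return sorted(names, key=sort_key)
```

Because the first tuple element separates matching from non-matching names, the integer and string second elements are never compared with each other.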
The controller provides REST endpoints: /register_worker, /refresh_all_workers, /list_models (with sorted output), /get_worker_address, /receive_heart_beat, /worker_generate_stream (proxy), and /worker_get_status (aggregated). The controller can also act as a hierarchical worker itself for connecting isolated sub-networks.
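To illustrate how the 30-second heartbeat expiration might interact with the heartbeats reported through /receive_heart_beat, here is a small sketch that finds workers whose last heartbeat is too old. The `find_stale_workers` helper and the plain timestamp dictionary are hypothetical; the real controller keeps this state in its WorkerInfo registry and may prune workers differently.

```python
import time
from typing import Dict, List, Optional

HEART_BEAT_EXPIRATION = 30  # seconds, the value stated in the description


def find_stale_workers(last_heart_beat: Dict[str, float],
                       now: Optional[float] = None,
                       expiration: float = HEART_BEAT_EXPIRATION) -> List[str]:
    """Hypothetical helper: names of workers not heard from within `expiration`."""
    now = time.time() if now is None else now
    cutoff = now - expiration
    # A worker is stale when its last heartbeat is strictly older than the cutoff.
    return [name for name, seen in last_heart_beat.items() if seen < cutoff]
```

A controller could run a check like this periodically in a background thread and drop the returned workers from its registry.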
Usage
Use this controller as the central orchestration service for the InternVL Streamlit demo, coordinating multiple GPU workers serving different InternVL model variants.
Code Reference
Source Location
- Repository: OpenGVLab_InternVL
- File: streamlit_demo/controller.py
- Lines: 1-291
Signature
class DispatchMethod(Enum):
    LOTTERY = auto()
    SHORTEST_QUEUE = auto()

@dataclasses.dataclass
class WorkerInfo:
    model_names: List[str]
    speed: int
    queue_length: int
    check_heart_beat: bool
    last_heart_beat: str

class Controller:
    def __init__(self, dispatch_method: str): ...
    def register_worker(self, worker_name, check_heart_beat, worker_status): ...
    def list_models(self): ...  # InternVL-sorted
    def get_worker_address(self, model_name): ...
    def receive_heart_beat(self, worker_name, queue_length): ...
    def worker_api_generate_stream(self, params): ...
    def worker_api_get_status(self): ...
Import
# Standalone server script:
# python streamlit_demo/controller.py --host 0.0.0.0 --port 10075 --dispatch-method shortest_queue
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| --host | str | No | Server host (default: "0.0.0.0") |
| --port | int | No | Server port (default: 10075) |
| --dispatch-method | str | No | Load balancing: "lottery" or "shortest_queue" (default: "shortest_queue") |
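The inputs in the table above map naturally onto an argparse parser. The following is a sketch built only from the flags and defaults listed in the table; the `build_parser` function name and the use of `choices` to restrict the dispatch method are assumptions, and the real script may define additional options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented controller flags."""
    parser = argparse.ArgumentParser(description="Controller CLI sketch")
    parser.add_argument("--host", type=str, default="0.0.0.0")
    parser.add_argument("--port", type=int, default=10075)
    parser.add_argument(
        "--dispatch-method",
        type=str,
        choices=["lottery", "shortest_queue"],  # assumed to be enforced here
        default="shortest_queue",
    )
    return parser
```

Calling `build_parser().parse_args([])` yields the documented defaults, and argparse exposes `--dispatch-method` as the attribute `dispatch_method`.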
Outputs
| Name | Type | Description |
|---|---|---|
| REST API | FastAPI endpoints | Worker management, model listing (sorted by InternVL model size), request dispatch |
Usage Examples
Basic Usage
# Start the Streamlit demo controller:
# python streamlit_demo/controller.py --port 10075
# Workers register and report heartbeats
# The Streamlit UI queries /list_models for sorted model selection
# Generation requests are proxied via /worker_generate_stream
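A client-side view of the workflow above might look like the sketch below, which uses only the Python standard library. Several details are assumptions not confirmed by this page: that the endpoints accept POST requests with JSON bodies (as in the LLaVA controller this one derives from), the `{"model": ...}` payload shape, the `localhost` address, and the helper names `build_post` and `call`.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

# Assumed address: local machine, default port from the inputs table.
CONTROLLER_URL = "http://localhost:10075"


def build_post(endpoint: str, payload: Optional[Dict[str, Any]] = None,
               base_url: str = CONTROLLER_URL) -> urllib.request.Request:
    """Hypothetical helper: build a JSON POST request for a controller endpoint."""
    data = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + endpoint,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: endpoints accept POST
    )


def call(request: urllib.request.Request, timeout: float = 5.0) -> Any:
    """Send the request and decode the JSON response (needs a running controller)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Requests a UI might issue; the payload shape is an assumption.
list_models_request = build_post("/list_models")
worker_address_request = build_post("/get_worker_address", {"model": "InternVL2-8B"})
```

Only `call` touches the network, so the request objects can be inspected or logged before a controller is actually running.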