Implementation:Haotian liu LLaVA Build Demo Gradio
Overview
Concrete tool for building and launching LLaVA's Gradio-based web chat interface. The build_demo() function constructs the full UI, and http_bot() handles streaming inference.
Source
- File: llava/serve/gradio_web_server.py
- Lines: L315-452 (build_demo), L154-286 (http_bot), L455-479 (main)
Signature
def build_demo(embed_mode: bool, cur_dir: str = None, concurrency_count: int = 10) -> gr.Blocks:
    """
    Build and return a configured Gradio Blocks application.

    Args:
        embed_mode: If True, hide header/footer for embedding in other pages.
        cur_dir: Current directory for serving static assets.
        concurrency_count: Maximum number of concurrent requests.

    Returns:
        gr.Blocks: Configured Gradio application ready to launch.
    """

# Main launch pattern:
# demo.queue(api_open=False).launch(
#     server_name=args.host,
#     server_port=args.port,
#     share=args.share
# )
CLI Usage
python -m llava.serve.gradio_web_server \
--controller http://localhost:21001 \
--port 7860
With sharing enabled:
python -m llava.serve.gradio_web_server \
--controller http://localhost:21001 \
--port 7860 \
--share
Import
from llava.serve.gradio_web_server import build_demo
Inputs
| Parameter | Type | Description |
|---|---|---|
| controller_url | str | URL of the controller (e.g., http://localhost:21001) |
| host | str | Hostname to bind the Gradio server (default: 0.0.0.0) |
| port | int | Port number for the Gradio server (default: 7860) |
| concurrency_count | int | Maximum concurrent requests (default: 10) |
| share | bool | Create a public Gradio share link |
| embed_mode | bool | Hide header/footer for embedding |
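The inputs above map naturally onto command-line flags. The following is a minimal sketch of such a parser using only the standard library; the flag spellings (`--controller-url`, `--concurrency-count`, `--embed`) and the controller default are assumptions chosen to mirror the table, not a copy of the parser in gradio_web_server.py.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the Inputs table (flag names are assumed)."""
    parser = argparse.ArgumentParser(description="Sketch of a Gradio web server CLI")
    # Assumed default, taken from the example URL in the table.
    parser.add_argument("--controller-url", type=str, default="http://localhost:21001")
    parser.add_argument("--host", type=str, default="0.0.0.0")
    parser.add_argument("--port", type=int, default=7860)
    parser.add_argument("--concurrency-count", type=int, default=10)
    # Boolean switches: absent means False, present means True.
    parser.add_argument("--share", action="store_true")
    parser.add_argument("--embed", action="store_true")
    return parser


args = make_parser().parse_args(["--controller", "http://localhost:21001", "--port", "8000", "--share"])
print(args.controller_url, args.host, args.port, args.share, args.embed)
```

Note that argparse accepts any unambiguous prefix of a long option by default, which is why `--controller` resolves to `--controller-url` in this sketch.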
Outputs
Running Gradio web interface with the following UI components:
- Model selector dropdown (populated from controller's model list)
- Image upload widget for visual input
- Chat textbox for text input and conversation display
- Parameter sliders for temperature, max output tokens, and top-p
- Vote buttons (upvote, downvote, flag) for response quality feedback
Description
The build_demo() function constructs a Gradio Blocks interface that provides the full chat experience:
UI construction:
- Creates a gr.Blocks layout with model selector, image upload, chatbot display, text input, and control buttons.
- Wires event handlers: image upload triggers add_image(), text submit triggers add_text() then http_bot().
- Vote buttons call logging functions to record user feedback.
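The "add_text() then http_bot()" chain can be pictured as a fast state update followed by a generator that streams refreshed chat state. The sketch below simulates that two-step pattern in plain Python without Gradio; `add_text_sketch`, `http_bot_sketch`, and `on_submit` are simplified hypothetical stand-ins, and their signatures do not match the real handlers in gradio_web_server.py.

```python
from typing import Iterable, Iterator, List, Tuple

History = List[List[str]]  # each entry is [user_message, assistant_message]


def add_text_sketch(history: History, text: str) -> Tuple[History, str]:
    """Step 1: append the user's message with an empty reply slot and clear the textbox."""
    return history + [[text, ""]], ""


def http_bot_sketch(history: History, tokens: Iterable[str]) -> Iterator[History]:
    """Step 2: a generator; every yield is a new history for the chat display."""
    user_message = history[-1][0]
    partial = ""
    for token in tokens:
        partial += token
        yield history[:-1] + [[user_message, partial]]


def on_submit(history: History, text: str, tokens: Iterable[str]) -> Iterator[History]:
    """Chain the two steps in the order a submit event would run them."""
    history, _cleared_textbox = add_text_sketch(history, text)
    yield history  # the UI first shows the user's message
    yield from http_bot_sketch(history, tokens)  # then the reply grows token by token


# Hypothetical token source standing in for the model worker's output.
frames = list(on_submit([], "hi", ["He", "llo"]))
for frame in frames:
    print(frame)
```

Splitting the work this way lets the interface echo the user's message immediately while the slower generation step keeps updating the same chat entry.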
Streaming inference (http_bot):
- Constructs a request payload with the conversation history and base64-encoded image.
- Asks the controller for the address of a worker serving the selected model, then sends a POST request to that worker's /worker_generate_stream endpoint.
- Iterates over the streaming response, yielding partial text to update the chat display.
- Handles errors (worker timeout, model not found) with user-friendly messages.
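The streaming steps above can be sketched with the standard library alone. This example assumes the stream consists of null-byte-delimited JSON objects carrying a cumulative `text` field (which echoes the prompt) and an `error_code` field, and that images travel as base64 strings under an `images` key; the helper names `build_payload` and `iter_partial_text` are hypothetical, and the network call is replaced by a simulated byte stream.

```python
import base64
import json
from typing import Iterable, Iterator


def build_payload(model: str, prompt: str, image_bytes: bytes,
                  temperature: float = 0.2, top_p: float = 0.7,
                  max_new_tokens: int = 512) -> dict:
    """Assemble a request body; the key names are assumptions for illustration."""
    return {
        "model": model,
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
        "max_new_tokens": max_new_tokens,
        # Binary image data is base64-encoded so it can be carried in JSON.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
    }


def iter_partial_text(chunks: Iterable[bytes], prompt: str) -> Iterator[str]:
    """Yield the reply generated so far for each chunk of the stream."""
    for chunk in chunks:
        if not chunk:  # skip the empty piece after a trailing delimiter
            continue
        data = json.loads(chunk.decode("utf-8"))
        if data.get("error_code", 0) != 0:
            # Surface the error as a readable message and stop streaming.
            yield f"{data.get('text', '')} (error_code: {data['error_code']})"
            return
        # Each chunk repeats the prompt; strip it to keep only the reply.
        yield data["text"][len(prompt):].strip()


# Simulated stream; with the requests library one could instead iterate
# response.iter_lines(decode_unicode=False, delimiter=b"\0").
prompt = "USER: hi ASSISTANT:"
raw = b"\0".join(
    json.dumps({"text": prompt + reply, "error_code": 0}).encode("utf-8")
    for reply in (" Hel", " Hello")
) + b"\0"
updates = list(iter_partial_text(raw.split(b"\0"), prompt))
print(updates)
```

Because each chunk carries the full text so far rather than a delta, the consumer can simply replace the displayed reply on every update.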
Metadata
| Field | Value |
|---|---|
| Knowledge Sources | Repo - LLaVA - https://github.com/haotian-liu/LLaVA |
| Domains | Web_Interface, Model_Serving |
| Last Updated | 2026-02-13 14:00 GMT |