Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:OpenGVLab InternVL Streamlit Chat App

From Leeroopedia


Knowledge Sources
Domains Web Application, Multimodal Chat, Streamlit UI, InternVL2
Last Updated 2026-02-07 14:00 GMT

Overview

This module implements a Streamlit-based web application providing an interactive bilingual chat interface for InternVL2 multimodal models with image upload, bounding box visualization, and image generation capabilities.

Description

The app.py file is the primary user-facing demo application for InternVL2, built with Streamlit. It provides a rich interactive experience with the following features:

UI Layout:

  • Bilingual support (English/Chinese) with language selector
  • Sidebar containing model selector, system prompt editor, and advanced parameter controls (temperature, top_p, repetition_penalty, max_new_tokens, max_input_tiles for image resolution control)
  • Multi-image upload supporting up to 4 images (PNG, JPG, JPEG, WebP)
  • Gallery examples with pre-loaded images and captions for quick demonstration
  • Chat message display with streaming response rendering

Core Functions:

  • generate_response: Sends conversation messages with base64-encoded images to model workers via streaming HTTP, displays responses with a typing indicator
  • find_bounding_boxes: Parses <ref>/<box> tags in model responses and renders colored bounding boxes with category labels using PIL ImageDraw
  • query_image_generation: Detects drawing-instruction code blocks in responses and calls a Stable Diffusion worker to generate images
  • load_upload_file_and_show: Handles file uploads with MD5 hashing for image caching
  • save_chat_history: Logs conversations to JSON files with timestamps

Post-processing:

  • LaTeX rendering support by converting \[\] and \(\) delimiters to $ signs
  • Phi3-3.8B abnormal character filtering
  • Alias instruction expansion for object detection shortcuts

Usage

Use this application to deploy an interactive web demo for InternVL2 models. It requires a running controller and model worker infrastructure, and optionally a Stable Diffusion worker for image generation.

Code Reference

Source Location

Signature

def get_model_list() -> list
def generate_response(messages) -> str
def find_bounding_boxes(response) -> Optional[Image]
def query_image_generation(response, sd_worker_url, timeout=15) -> Optional[Image]
def load_upload_file_and_show() -> tuple[list[Image], list[str]]
def save_chat_history() -> None
def clear_chat_history() -> None
def pil_image_to_base64(image) -> str
def show_one_or_multiple_images(message, total_image_num, is_input=True) -> None

Import

# Run as a Streamlit application:
# streamlit run streamlit_demo/app.py -- --controller_url http://localhost:10075

I/O Contract

Inputs

Name Type Required Description
--controller_url str No URL of the model controller (default: "http://10.140.60.209:10075")
--sd_worker_url str No URL of the Stable Diffusion worker for image generation (default: "http://0.0.0.0:40006")
--max_image_limit int No Maximum number of images per conversation (default: 4)
User text input str Yes Chat message from the user
Uploaded images PIL.Image No Up to 4 uploaded images (PNG, JPG, JPEG, WebP)

Outputs

Name Type Description
Chat response str Streamed model response with markdown rendering
Bounding box images PIL.Image Images with drawn bounding boxes when model outputs <ref>/<box> tags
Generated images PIL.Image Stable Diffusion generated images when model outputs drawing-instruction blocks
Conversation logs JSON files Logged conversations with timestamps, model info, and message history

Usage Examples

Basic Usage

# Launch the Streamlit demo
# streamlit run streamlit_demo/app.py -- \
#     --controller_url http://localhost:10075 \
#     --sd_worker_url http://localhost:40006 \
#     --max_image_limit 4

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment