Implementation:Lm sys FastChat Build Side By Side Vision Named UI
| Knowledge Sources | |
|---|---|
| Domains | Web_UI, Model_Evaluation |
| Last Updated | 2026-02-07 06:00 GMT |
Overview
Constructs the named side-by-side vision comparison tab where users choose which two multimodal models to compare on image-understanding tasks.
Description
The build_side_by_side_vision_ui_named function creates a Gradio UI component that allows users to select two specific vision-language models from dropdown menus and compare their responses to image-based prompts side by side. Users upload an image and submit a text query; both selected models process the same image and prompt, with responses streamed in parallel.
The load_demo_side_by_side_vision_named function initializes the tab by populating the model dropdowns from context.text_models. The left model defaults to the first available model, while the right model is drawn uniformly at random from the remaining models. The function takes the Context object from fastchat.serve.gradio_global_state, which provides centralized access to the available model lists.
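The default selection described above can be sketched as follows. This is a minimal illustration rather than the module's own code: `pick_default_models` is a hypothetical helper, the standard library's `random.choice` stands in for the uniform draw, and the handling of empty or single-model lists is an assumption.

```python
import random
from typing import List, Optional, Tuple


def pick_default_models(
    models: List[str], rng: Optional[random.Random] = None
) -> Tuple[str, str]:
    """Hypothetical helper: first model on the left, uniform random pick on the right."""
    rng = rng or random.Random()
    if not models:
        # Assumption: fall back to empty selections when no models are registered.
        return "", ""
    left = models[0]
    # Assumption: with a single model there is nothing else to draw from, so reuse it.
    right = rng.choice(models[1:]) if len(models) > 1 else left
    return left, right
```

Passing a seeded `random.Random` makes the right-hand pick reproducible, which is convenient when testing the initialization path.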
The module reuses key infrastructure from related modules. Voting functions (leftvote_last_response, rightvote_last_response, tievote_last_response, bothbad_vote_last_response) follow the standard arena pattern of logging to both local files and remote loggers. The flash_buttons, share_click, and bot_response_multi functions are imported from gradio_block_arena_named. Image handling utilities (get_vqa_sample, set_visible_image, set_invisible_image, add_image, moderate_input, _prepare_text_with_image, convert_images_to_conversation_format) are imported from gradio_block_arena_vision. The local add_text function (line 190) combines named-model selection with vision input processing, running content moderation on both text and images before constructing conversation states for each selected model.
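The moderate-then-initialize flow attributed to add_text can be sketched in isolation. Everything below is hypothetical and only mirrors the order of operations described above: `SketchState` stands in for the real State class, the two moderation callables stand in for the imported moderation helpers, and returning `None` for flagged input is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class SketchState:
    """Hypothetical stand-in for the per-model conversation state."""

    model_name: str
    messages: List[Tuple[str, str, List[str]]] = field(default_factory=list)


def add_text_sketch(
    text: str,
    images: List[str],
    model_names: Tuple[str, str],
    is_flagged_text: Callable[[str], bool],
    is_flagged_image: Callable[[str], bool],
) -> Optional[List[SketchState]]:
    """Moderate the input once, then build one state per selected model."""
    if is_flagged_text(text) or any(is_flagged_image(img) for img in images):
        # Assumption: flagged input produces no conversation states.
        return None
    states = [SketchState(name) for name in model_names]
    for state in states:
        # Both models receive the same prompt and the same images.
        state.messages.append(("user", text, list(images)))
    return states
```

Moderating before any state is created means a rejected prompt never reaches either model, and both sides always start from identical input.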
Usage
Use this module when building the named multimodal vision comparison tab for Chatbot Arena. It complements the anonymous vision battle by letting users deliberately choose which vision-language models to evaluate against each other, which is useful for targeted benchmarking of specific model capabilities on visual reasoning tasks.
Code Reference
Source Location
- Repository: Lm_sys_FastChat
- File: fastchat/serve/gradio_block_arena_vision_named.py
- Lines: 1-581
Signature
def build_side_by_side_vision_ui_named(context: Context, random_questions=None):
    """
    Build the named side-by-side vision comparison UI.

    Args:
        context: Global Context object containing model lists and configuration.
        random_questions: Optional list of VQA sample dicts for the random example button.

    Returns:
        list: A list containing two gr.State objects and two model selector
            Dropdown components [state0, state1, model_selector0, model_selector1].
    """
Import
from fastchat.serve.gradio_block_arena_vision_named import build_side_by_side_vision_ui_named
Key Functions
| Function | Line | Description |
|---|---|---|
| build_side_by_side_vision_ui_named | 305 | Main entry point; constructs the named vision comparison Gradio tab |
| load_demo_side_by_side_vision_named | 72 | Initializes states and populates model dropdowns from context |
| clear_history_example | 95 | Resets state when loading a random VQA example |
| vote_last_response | 106 | Core vote logger; writes vote data to file and remote logger |
| leftvote_last_response | 120 | Records a vote for the left model |
| rightvote_last_response | 130 | Records a vote for the right model |
| tievote_last_response | 140 | Records a tie vote between both models |
| bothbad_vote_last_response | 150 | Records a vote indicating both responses were bad |
| regenerate | 160 | Clears the last assistant turn and re-generates responses |
| clear_history | 179 | Resets conversation state, chatbot displays, and image panel |
| add_text | 190 | Named-model vision input pipeline with image moderation and state initialization |
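The relationship between the four vote handlers and the core vote logger in the table above can be illustrated with a small sketch. It covers only the local-file half of the logging and omits the remote logger; the function names, record fields, and vote-type strings are illustrative assumptions, not the module's actual schema.

```python
import json
import time
from typing import Dict, List


def log_vote(log_path: str, vote_type: str, model_names: List[str], ip: str) -> Dict:
    """Hypothetical core logger: append one JSON record per vote."""
    record = {
        "tstamp": round(time.time(), 4),
        "type": vote_type,
        "models": list(model_names),
        "ip": ip,
    }
    # Append mode keeps earlier votes; one JSON object per line.
    with open(log_path, "a", encoding="utf-8") as fout:
        fout.write(json.dumps(record) + "\n")
    return record


# Thin wrappers: each handler only fixes the vote type (labels are assumptions).
def left_vote(log_path: str, model_names: List[str], ip: str) -> Dict:
    return log_vote(log_path, "leftvote", model_names, ip)


def right_vote(log_path: str, model_names: List[str], ip: str) -> Dict:
    return log_vote(log_path, "rightvote", model_names, ip)


def tie_vote(log_path: str, model_names: List[str], ip: str) -> Dict:
    return log_vote(log_path, "tievote", model_names, ip)


def both_bad_vote(log_path: str, model_names: List[str], ip: str) -> Dict:
    return log_vote(log_path, "bothbad_vote", model_names, ip)
```

Keeping the wrappers this thin means the record format is defined in one place, so a schema change touches only the core logger.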
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| context | Context | Yes | Global state object from fastchat.serve.gradio_global_state containing vision model lists and configuration |
| random_questions | list | No | Optional list of VQA sample dicts with "question" and "path" keys for the random example button |
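As an illustration of the random_questions shape, the sketch below picks one sample and checks that it carries the "question" and "path" keys. `pick_vqa_sample` is a hypothetical helper written for this page; it is not the module's get_vqa_sample, and returning `None` when no samples are supplied is an assumption.

```python
import random
from typing import Dict, List, Optional


def pick_vqa_sample(
    samples: Optional[List[Dict[str, str]]],
    rng: Optional[random.Random] = None,
) -> Optional[Dict[str, str]]:
    """Hypothetical helper: return one random sample, or None if none are configured."""
    if not samples:
        return None
    rng = rng or random.Random()
    sample = rng.choice(samples)
    # Each sample is expected to provide both keys named in the table above.
    missing = {"question", "path"} - set(sample)
    if missing:
        raise KeyError(f"VQA sample is missing keys: {sorted(missing)}")
    return sample
```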
Outputs
| Name | Type | Description |
|---|---|---|
| returns | list | List of [state0, state1, model_selector0, model_selector1] Gradio State and Dropdown components |
Dependencies
Internal Imports
from fastchat.constants import (
    TEXT_MODERATION_MSG, IMAGE_MODERATION_MSG, MODERATION_MSG,
    CONVERSATION_LIMIT_MSG, SLOW_MODEL_MSG,
    INPUT_CHAR_LEN_LIMIT, CONVERSATION_TURN_LIMIT, SURVEY_LINK,
)
from fastchat.model.model_adapter import get_conversation_template
from fastchat.serve.gradio_block_arena_named import (
    flash_buttons, share_click, bot_response_multi,
)
from fastchat.serve.gradio_block_arena_vision import (
    get_vqa_sample, set_invisible_image, set_visible_image,
    add_image, moderate_input, _prepare_text_with_image,
    convert_images_to_conversation_format,
    enable_multimodal, disable_multimodal,
    invisible_text, invisible_btn, visible_text,
)
from fastchat.serve.gradio_global_state import Context
from fastchat.serve.gradio_web_server import (
    State, bot_response, get_conv_log_filename,
    no_change_btn, enable_btn, disable_btn, invisible_btn,
    acknowledgment_md, get_ip, get_model_description_md, enable_text,
)
from fastchat.serve.remote_logger import get_remote_logger
from fastchat.utils import build_logger, moderation_filter, image_moderation_filter
External Imports
import json
import os
import time
from typing import List, Union
import gradio as gr
import numpy as np
Usage Examples
# Building the named vision comparison tab within a Gradio Blocks layout
import gradio as gr
from fastchat.serve.gradio_global_state import Context
from fastchat.serve.gradio_block_arena_vision_named import (
    build_side_by_side_vision_ui_named,
    load_demo_side_by_side_vision_named,
)

context = Context()
context.text_models = ["llava-v1.5-7b", "llava-v1.5-13b", "cogvlm-chat"]
vqa_samples = [
    {"question": "Describe this image in detail.", "path": "/data/vqa/scene.jpg"},
]

with gr.Blocks() as demo:
    with gr.Tab("Vision Arena (side-by-side)"):
        states_and_selectors = build_side_by_side_vision_ui_named(
            context,
            random_questions=vqa_samples,
        )
    # On page load, initialize the states and model dropdowns.
    # Gradio event inputs must be components, so the Context object is
    # bound through a closure instead of being passed as an input.
    demo.load(
        lambda: load_demo_side_by_side_vision_named(context),
        None,
        states_and_selectors,
    )
Related Pages
- Principle:Lm_sys_FastChat_Arena_Battle_UI
- Implements: Principle:Lm_sys_FastChat_Arena_Battle_UI
- Environment:Lm_sys_FastChat_GPU_CUDA_Inference
- Implementation:Lm_sys_FastChat_Build_Side_By_Side_Arena_Named_UI
- Implementation:Lm_sys_FastChat_Build_Single_Model_Vision_UI
- Implementation:Lm_sys_FastChat_Build_Side_By_Side_Vision_Anony_UI