Implementation:Lm sys FastChat Build Side By Side Vision Anony UI
| Knowledge Sources | |
|---|---|
| Domains | Web_UI, Model_Evaluation |
| Last Updated | 2026-02-07 06:00 GMT |
Overview
Constructs the anonymous side-by-side vision arena battle tab where users compare image-understanding responses from two randomly assigned, hidden multimodal models.
Description
The build_side_by_side_vision_ui_anony function creates a Gradio UI component that extends the text-only anonymous arena concept to multimodal vision-language models. Users can upload an image via a MultimodalTextbox and submit a text prompt. The system randomly selects two vision-capable models, streams their responses side by side, and keeps model identities hidden until the user votes.
This module heavily reuses infrastructure from both the text anonymous arena and the single-model vision modules. It imports the voting functions (leftvote_last_response, rightvote_last_response, tievote_last_response, bothbad_vote_last_response) and battle pairing logic (get_battle_pair, SAMPLING_WEIGHTS, BATTLE_TARGETS, SAMPLING_BOOST_MODELS, OUTAGE_MODELS) from gradio_block_arena_anony. Image handling utilities (set_visible_image, set_invisible_image, add_image, moderate_input, _prepare_text_with_image, convert_images_to_conversation_format) are imported from gradio_block_arena_vision.
The module defines its own add_text function (line 246) that combines vision-specific input processing with the anonymous battle pairing flow. It validates text and images, runs dual-layer moderation (text and image), selects a battle pair of vision-capable models, and initializes conversation states for both. The load_demo_side_by_side_vision_anony function sets up initial states and model selectors. An optional get_vqa_sample function and random question button allow users to load pre-configured visual question-answering examples for quick testing.
Usage
Use this module when building the anonymous multimodal vision battle tab for Chatbot Arena. It is called from the main multi-model Gradio launcher alongside the text-only arena tabs to provide blind evaluation of vision-language models. This mode generates preference data specifically for ranking multimodal model capabilities.
Code Reference
Source Location
- Repository: Lm_sys_FastChat
- File: fastchat/serve/gradio_block_arena_vision_anony.py
- Lines: 1-680
Signature
def build_side_by_side_vision_ui_anony(context: Context, random_questions=None):
"""
Build the anonymous side-by-side vision arena battle UI.
Args:
context: Global Context object containing model lists and configuration.
random_questions: Optional list of VQA sample dicts for the random example button.
Returns:
list: A list containing two gr.State objects and two model selector
Markdown components [state0, state1, model_selector0, model_selector1].
"""
Import
from fastchat.serve.gradio_block_arena_vision_anony import build_side_by_side_vision_ui_anony
Key Functions
| Function | Line | Description |
|---|---|---|
| build_side_by_side_vision_ui_anony | 378 | Main entry point; constructs the anonymous vision arena Gradio tab |
| load_demo_side_by_side_vision_anony | 107 | Initializes states and model selector visibility on page load |
| get_vqa_sample | 100 | Selects a random VQA sample with question text and image path |
| clear_history_example | 117 | Resets state when loading a random VQA example |
| vote_last_response | 130 | Core vote logger; writes vote data to file and remote logger |
| leftvote_last_response | 173 | Records a vote for the left model (Model A) |
| rightvote_last_response | 183 | Records a vote for the right model (Model B) |
| tievote_last_response | 193 | Records a tie vote between both models |
| bothbad_vote_last_response | 203 | Records a vote indicating both responses were bad |
| regenerate | 213 | Clears the last assistant turn and re-generates responses |
| clear_history | 232 | Resets conversation state, chatbot displays, and image panel |
| add_text | 246 | Vision-specific input pipeline with anonymous battle pairing and image moderation |
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| context | Context | Yes | Global state object from fastchat.serve.gradio_global_state containing vision model lists |
| random_questions | list | No | Optional list of VQA sample dicts with "question" and "path" keys for the random example button |
Outputs
| Name | Type | Description |
|---|---|---|
| returns | list | List of [state0, state1, model_selector0, model_selector1] Gradio State and Markdown components |
Dependencies
Internal Imports
from fastchat.constants import (
TEXT_MODERATION_MSG, IMAGE_MODERATION_MSG, MODERATION_MSG,
CONVERSATION_LIMIT_MSG, SLOW_MODEL_MSG,
BLIND_MODE_INPUT_CHAR_LEN_LIMIT, CONVERSATION_TURN_LIMIT, SURVEY_LINK,
)
from fastchat.model.model_adapter import get_conversation_template
from fastchat.serve.gradio_block_arena_named import flash_buttons
from fastchat.serve.gradio_web_server import (
State, bot_response, get_conv_log_filename,
no_change_btn, enable_btn, disable_btn, invisible_btn,
acknowledgment_md, get_ip, get_model_description_md,
disable_text, enable_text,
)
from fastchat.serve.gradio_block_arena_anony import (
flash_buttons, vote_last_response, leftvote_last_response,
rightvote_last_response, tievote_last_response, bothbad_vote_last_response,
regenerate, clear_history, share_click, bot_response_multi,
set_global_vars_anony, load_demo_side_by_side_anony,
get_sample_weight, get_battle_pair,
SAMPLING_WEIGHTS, BATTLE_TARGETS, SAMPLING_BOOST_MODELS, OUTAGE_MODELS,
)
from fastchat.serve.gradio_block_arena_vision import (
set_invisible_image, set_visible_image, add_image, moderate_input,
enable_multimodal, _prepare_text_with_image,
convert_images_to_conversation_format,
invisible_text, visible_text, disable_multimodal,
)
from fastchat.serve.gradio_global_state import Context
from fastchat.serve.remote_logger import get_remote_logger
from fastchat.utils import build_logger, moderation_filter, image_moderation_filter
External Imports
import json
import time
from typing import Union
import gradio as gr
import numpy as np
Usage Examples
# Building the anonymous vision arena tab within a Gradio Blocks layout
import gradio as gr
from fastchat.serve.gradio_global_state import Context
from fastchat.serve.gradio_block_arena_vision_anony import (
build_side_by_side_vision_ui_anony,
load_demo_side_by_side_vision_anony,
)
context = Context()
context.text_models = ["llava-v1.5-7b", "llava-v1.5-13b", "cogvlm-chat"]
vqa_samples = [
{"question": "What animal is in this image?", "path": "/data/vqa/cat.jpg"},
]
with gr.Blocks() as demo:
with gr.Tab("Vision Arena (battle)"):
states_and_selectors = build_side_by_side_vision_ui_anony(
context,
random_questions=vqa_samples,
)
# On page load, initialize states
demo.load(
load_demo_side_by_side_vision_anony,
[],
states_and_selectors,
)
Related Pages
- Principle:Lm_sys_FastChat_Arena_Battle_UI
- Implements: Principle:Lm_sys_FastChat_Arena_Battle_UI
- Environment:Lm_sys_FastChat_GPU_CUDA_Inference
- Implementation:Lm_sys_FastChat_Build_Side_By_Side_Arena_Anony_UI
- Implementation:Lm_sys_FastChat_Build_Single_Model_Vision_UI
- Implementation:Lm_sys_FastChat_Build_Side_By_Side_Vision_Named_UI