Implementation:Lm sys FastChat Clean Battle Data

Knowledge Sources	Lm_sys_FastChat Chatbot Arena
Domains	Model_Evaluation, Data_Cleaning
Last Updated	2026-02-07 06:00 GMT

Overview

clean_battle_data filters, validates, and deduplicates raw arena battle log files to produce a clean DataFrame suitable for Elo rating computation and statistical analysis.

Description

The clean_battle_data.py module is a critical preprocessing step in the Chatbot Arena analytics pipeline. Raw battle logs collected from the live arena contain noise from various sources: bot traffic, duplicate submissions, banned users, malformed conversations, and invalid model pairings. This module systematically removes these artifacts to produce a reliable dataset for downstream rating computations.

The primary function, clean_battle_data, accepts a list of log file paths along with filtering parameters and returns a pandas DataFrame of validated battle records. Cleaning applies the following filters in sequence: it removes battles involving excluded model names, drops requests from banned IP addresses, deduplicates battles by a hash of the conversation content, validates that each battle contains properly formatted conversation turns, and enforces a minimum conversation length.
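The filter sequence described above can be sketched roughly as follows. This is a minimal illustration and not the module's actual code: the helper names (conversation_hash, is_well_formed, filter_battles), the use of SHA-256 for the content hash, the message layout (dicts with role and content keys), and the min_messages threshold are all assumptions made for the example.

```python
import hashlib
import json


def conversation_hash(battle):
    """Hash both conversations so identical content maps to the same key (assumed scheme)."""
    payload = json.dumps(
        [battle.get("conversation_a"), battle.get("conversation_b")], sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_well_formed(conversation, min_messages=2):
    """Require a list of {role, content} dicts with at least min_messages entries."""
    if not isinstance(conversation, list) or len(conversation) < min_messages:
        return False
    return all(
        isinstance(m, dict) and "role" in m and isinstance(m.get("content"), str)
        for m in conversation
    )


def filter_battles(battles, exclude_model_names=None, ban_ip_list=None):
    """Apply the filters in the order the description lists them."""
    excluded = set(exclude_model_names or [])
    banned = set(ban_ip_list or [])
    seen_hashes = set()
    kept = []
    for battle in battles:
        # 1. Drop battles that involve an excluded model.
        if battle["model_a"] in excluded or battle["model_b"] in excluded:
            continue
        # 2. Drop requests from banned IP addresses.
        if battle.get("ip") in banned:
            continue
        # 3. Deduplicate on a hash of the conversation content.
        key = conversation_hash(battle)
        if key in seen_hashes:
            continue
        seen_hashes.add(key)
        # 4. and 5. Validate the turn format and the minimum length.
        if not is_well_formed(battle.get("conversation_a")):
            continue
        if not is_well_formed(battle.get("conversation_b")):
            continue
        kept.append(battle)
    return kept
```

Each filter is a cheap early exit, so the inexpensive checks (model name, IP) run before the hash is computed.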

The module supports two operating modes controlled by the mode parameter. In the default mode, it applies standard filtering suitable for general leaderboard computation. An alternative strict mode applies additional constraints for research-quality datasets, such as requiring longer conversations and enforcing stricter deduplication thresholds. The sanitize_ip flag controls whether IP addresses are hashed in the output for privacy compliance.
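As an illustration of what hashing IP addresses for the output could look like, here is a minimal sketch. The function name sanitize_ip_address, the salted SHA-256 scheme, and the 16-character truncation are assumptions for this example; the module may sanitize addresses differently.

```python
import hashlib


def sanitize_ip_address(ip, salt="example-salt"):
    """Replace an IP address with a stable, non-reversible token (assumed scheme).

    The same IP always maps to the same token, so per-user statistics
    still work on the sanitized output.
    """
    digest = hashlib.sha256((salt + ip).encode("utf-8")).hexdigest()
    return digest[:16]
```

A stable mapping matters here: downstream analyses that group battles by user need equal inputs to produce equal tokens, which a random per-row replacement would not provide.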

Usage

Use this module as the first step in any arena data analysis workflow. It should be called before computing Elo ratings, generating leaderboard tables, or performing any statistical analysis on battle data. The cleaned output is consumed by elo_analysis.py, rating_systems.py, and the monitor dashboard.
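To show why the cleaned records are the natural input for rating computation, the sketch below runs a simple online Elo update over a list of battle dicts. It is not the code in elo_analysis.py: the function name compute_online_elo, the constants (k=4, scale=400, base=10, initial rating 1000), and the winner values "model_a" and "model_b" (anything else treated as a tie) are assumptions for illustration.

```python
from collections import defaultdict


def compute_online_elo(battles, k=4.0, scale=400.0, base=10.0, init_rating=1000.0):
    """Update ratings battle by battle; constants here are illustrative assumptions."""
    ratings = defaultdict(lambda: init_rating)
    for battle in battles:
        rating_a = ratings[battle["model_a"]]
        rating_b = ratings[battle["model_b"]]
        # Expected score of model_a under the logistic Elo model.
        expected_a = 1.0 / (1.0 + base ** ((rating_b - rating_a) / scale))
        if battle["winner"] == "model_a":
            score_a = 1.0
        elif battle["winner"] == "model_b":
            score_a = 0.0
        else:
            score_a = 0.5  # any other value is treated as a tie
        delta = k * (score_a - expected_a)
        ratings[battle["model_a"]] = rating_a + delta
        ratings[battle["model_b"]] = rating_b - delta
    return dict(ratings)
```

With a cleaned DataFrame, the list of dicts can be obtained with battles_df.to_dict("records"). Duplicate or bot-generated battles would each apply an extra update, which is why cleaning has to run first.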

Code Reference

Source Location

Repository: Lm_sys_FastChat
File: fastchat/serve/monitor/clean_battle_data.py
Lines: 1-423

Signature

def clean_battle_data(
    log_files: list[str],
    exclude_model_names: list[str] | None = None,
    ban_ip_list: list[str] | None = None,
    sanitize_ip: bool = False,
    mode: str = "default",
) -> pd.DataFrame:
    """Clean and validate arena battle log data for Elo rating computation.

    Args:
        log_files: Paths to raw battle log JSON files.
        exclude_model_names: Model names to exclude from results.
        ban_ip_list: IP addresses to filter out.
        sanitize_ip: Whether to hash IP addresses in output.
        mode: Cleaning mode, either "default" or "strict".

    Returns:
        A pandas DataFrame of cleaned battle records.
    """
    ...

Import

from fastchat.serve.monitor.clean_battle_data import clean_battle_data

I/O Contract

Inputs

Name	Type	Required	Description
log_files	`list[str]`	Yes	List of file paths to raw arena battle log JSON files
exclude_model_names	`list[str]`	No	Model names to exclude from the cleaned dataset (default: `None`)
ban_ip_list	`list[str]`	No	IP addresses to filter out from battle records (default: `None`)
sanitize_ip	`bool`	No	If `True`, hash IP addresses in the output for privacy (default: `False`)
mode	`str`	No	Cleaning mode: `"default"` for standard filtering, `"strict"` for research-quality constraints (default: `"default"`)

Outputs

Name	Type	Description
battles_df	`pd.DataFrame`	Cleaned DataFrame with columns: `model_a`, `model_b`, `winner`, `conversation_a`, `conversation_b`, `judge`, `turn`, `tstamp`, `ip` (hashed if sanitized)
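A quick way to guard downstream code against schema drift is to check the column list before use. The sketch below is a hypothetical helper (missing_columns is not part of FastChat); the expected names are taken from the Outputs table above.

```python
# Column names as listed in the Outputs table.
EXPECTED_COLUMNS = (
    "model_a",
    "model_b",
    "winner",
    "conversation_a",
    "conversation_b",
    "judge",
    "turn",
    "tstamp",
    "ip",
)


def missing_columns(columns, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent, in table order."""
    present = set(columns)
    return [name for name in expected if name not in present]
```

The helper accepts any iterable of names, so it can be called as missing_columns(battles_df.columns) on the returned DataFrame.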

Usage Examples

from fastchat.serve.monitor.clean_battle_data import clean_battle_data

# Clean battle data with default settings
log_files = [
    "logs/battles_2024_01.json",
    "logs/battles_2024_02.json",
]
battles_df = clean_battle_data(log_files)
print(f"Cleaned battles: {len(battles_df)}")
print(battles_df.head())

# Clean with strict mode and IP sanitization
battles_strict = clean_battle_data(
    log_files,
    exclude_model_names=["deprecated-model-v1"],
    ban_ip_list=["192.168.1.100"],
    sanitize_ip=True,
    mode="strict",
)
print(f"Strict cleaned battles: {len(battles_strict)}")

# Check model distribution in cleaned data
print(battles_strict["model_a"].value_counts().head(10))
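Counting model_a alone ignores the battles in which a model appeared on the B side. A hedged sketch of counting unordered model pairings instead is shown below; count_pairings is a hypothetical helper that operates on plain dicts with model_a and model_b keys.

```python
from collections import Counter


def count_pairings(battles):
    """Count battles per unordered model pair, regardless of side."""
    counts = Counter()
    for battle in battles:
        pair = tuple(sorted((battle["model_a"], battle["model_b"])))
        counts[pair] += 1
    return counts
```

For a cleaned DataFrame, count_pairings(battles_strict.to_dict("records")) yields the pair counts, and .most_common(10) lists the most frequent matchups.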

Related Pages

Implements: Principle:Lm_sys_FastChat_Battle_Data_Cleaning
Environment:Lm_sys_FastChat_GPU_CUDA_Inference
Lm_sys_FastChat_Elo_Analysis - Consumes cleaned battle data for Elo computation
Lm_sys_FastChat_Rating_Systems - Statistical rating systems applied to cleaned data
Lm_sys_FastChat_Monitor_Dashboard - Displays results derived from cleaned data
Lm_sys_FastChat_Category_Label_Pipeline - Labels cleaned conversations with categories
