Implementation:Lm_sys_FastChat_Category_Classifier
| Knowledge Sources | |
|---|---|
| Domains | Model_Evaluation, NLP_Classification |
| Last Updated | 2026-02-07 06:00 GMT |
Overview
CategoryClassifier provides LLM-based category classification for user prompts in the Chatbot Arena, enabling fine-grained analysis of conversation topics across arena battles.
Description
The category.py module implements the CategoryClassifier class, which classifies user prompts into predefined topic categories using large language model API calls. This classification is essential for computing category-specific Elo ratings and understanding how different models perform across various conversation domains such as coding, math, reasoning, creative writing, and general knowledge.
The classifier works by constructing a structured prompt that asks an LLM to assign one or more category labels to a given user message. It supports configurable category taxonomies and can be extended with new categories as the arena evolves. The classification prompt is carefully engineered to produce consistent, reproducible labels across diverse input types.
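The prompt-construction step described above can be sketched as follows. This is a hypothetical illustration of the pattern, not the actual template in category.py; the function name mirrors the class method, but the wording of the instruction is an assumption.

```python
def build_classification_prompt(user_prompt: str, category_list: list[str]) -> str:
    """Construct an instruction prompt asking an LLM to label a user message.

    Illustrative sketch only -- the real template in category.py may differ.
    """
    categories = ", ".join(category_list)
    return (
        "You are a classifier for conversation topics.\n"
        f"Valid categories: {categories}.\n"
        "Reply with a comma-separated list of matching categories only.\n\n"
        f"User message:\n{user_prompt}\n"
    )
```

Keeping the category list inside the prompt itself is what makes the taxonomy configurable: adding a category requires no change to the surrounding code.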
To handle large-scale datasets efficiently, the module provides batch classification capabilities with parallel API calls. Multiple prompts can be classified concurrently using asynchronous request patterns, significantly reducing the wall-clock time required to label entire conversation logs. Rate limiting and retry logic are built in to handle API throttling gracefully.
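One common way to realize the parallel-with-retry pattern described above is a thread pool plus exponential backoff. The sketch below is a minimal illustration under that assumption; the helper names (`with_retry`, `classify_one`) are hypothetical and not taken from category.py.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retry(fn, max_retries=3, base_delay=0.01):
    """Call fn, retrying with exponential backoff on failure.

    Sketch of throttling-tolerant retry logic; names are illustrative.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except RuntimeError:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))


def classify_batch(prompts, classify_one, num_workers=8):
    """Classify prompts concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # pool.map yields results in submission order, so output labels
        # line up index-for-index with the input prompts.
        return list(pool.map(lambda p: with_retry(lambda: classify_one(p)), prompts))
```

Because `pool.map` preserves submission order, the batch output can be zipped directly against the input prompts regardless of which API call finishes first.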
Usage
Use this module when you need to assign topic categories to user prompts from arena battle logs. This is typically invoked as part of the data preprocessing pipeline before computing category-specific leaderboard rankings. It is also useful for exploratory analysis of conversation distributions across the arena.
Code Reference
Source Location
- Repository: Lm_sys_FastChat
- File: fastchat/serve/monitor/classify/category.py
- Lines: 1-579
Signature
class CategoryClassifier:
    def __init__(self, category_list: list[str], model_name: str = "gpt-4", temperature: float = 0.0):
        ...

    def classify(self, prompt: str) -> list[str]:
        """Classify a single user prompt into one or more categories."""
        ...

    def classify_batch(self, prompts: list[str], num_workers: int = 8) -> list[list[str]]:
        """Classify a batch of prompts using parallel API calls."""
        ...

    def build_classification_prompt(self, user_prompt: str) -> str:
        """Construct the LLM prompt used for category classification."""
        ...
Import
from fastchat.serve.monitor.classify.category import CategoryClassifier
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| category_list | list[str] | Yes | List of valid category names for classification |
| model_name | str | No | LLM model to use for classification (default: "gpt-4") |
| temperature | float | No | Sampling temperature for LLM calls (default: 0.0) |
| prompt | str | Yes (for classify) | The user prompt text to classify |
| prompts | list[str] | Yes (for classify_batch) | List of user prompt texts to classify in batch |
| num_workers | int | No | Number of parallel workers for batch classification (default: 8) |
Outputs
| Name | Type | Description |
|---|---|---|
| categories | list[str] | List of category labels assigned to a single prompt |
| batch_categories | list[list[str]] | List of category label lists for each prompt in a batch |
Usage Examples
from fastchat.serve.monitor.classify.category import CategoryClassifier
# Initialize the classifier with predefined categories
categories = ["coding", "math", "reasoning", "creative_writing", "general"]
classifier = CategoryClassifier(category_list=categories, model_name="gpt-4")
# Classify a single prompt
user_prompt = "Write a Python function to compute the Fibonacci sequence."
labels = classifier.classify(user_prompt)
print(labels) # e.g., ["coding"]
# Classify a batch of prompts in parallel
prompts = [
    "Explain the theory of relativity.",
    "Write a haiku about autumn.",
    "Solve the integral of x^2 dx.",
]
batch_labels = classifier.classify_batch(prompts, num_workers=4)
for prompt, labels in zip(prompts, batch_labels):
    print(f"{prompt[:40]}... -> {labels}")
Related Pages
- Implements: Principle:Lm_sys_FastChat_LLM_Prompt_Classification
- Environment:Lm_sys_FastChat_GPU_CUDA_Inference
- Lm_sys_FastChat_Category_Label_Pipeline - Batch pipeline that uses CategoryClassifier
- Lm_sys_FastChat_Elo_Analysis - Consumes category labels for per-category Elo ratings
- Lm_sys_FastChat_Clean_Battle_Data - Preprocesses battle data before classification