Implementation:Hpcaitech ColossalAI AGIEvalDataset

Knowledge Sources	Hpcaitech_ColossalAI
Domains	Evaluation, Benchmarking
Last Updated	2026-02-09 00:00 GMT

Overview

AGIEvalDataset is a dataset wrapper class that loads and converts AGIEval benchmark data into the ColossalEval inference format, supporting both English and Chinese question-answering and cloze-style tasks.

Description

The class extends BaseDataset and provides a static load method that reads JSONL files from the AGIEval dataset directory. It handles multiple subcategories including English QA datasets (LSAT, SAT, AQUA-RAT), Chinese QA datasets (LogiQA, JEC-QA, Gaokao subjects), and cloze datasets for both languages. The module also includes helper functions get_prompt for formatting individual questions and combine_prompt for constructing few-shot demonstration prompts from CSV files.
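
The loading and formatting steps described above can be sketched with the standard library alone. This is an illustrative sketch, not the code in agieval.py: the helper names load_jsonl_dir and format_question are hypothetical, the record fields (passage, question, options, label, answer) are assumed from the public AGIEval JSONL layout, and the prompt wording is a placeholder.

```python
import glob
import json
import os
from typing import Dict, List, Optional


def load_jsonl_dir(path: str) -> Dict[str, List[dict]]:
    """Hypothetical helper: map each *.jsonl file stem to its parsed records."""
    records: Dict[str, List[dict]] = {}
    for file in sorted(glob.glob(os.path.join(path, "*.jsonl"))):
        name = os.path.splitext(os.path.basename(file))[0]
        with open(file, encoding="utf-8") as f:
            # JSONL stores one JSON object per line; blank lines are skipped.
            records[name] = [json.loads(line) for line in f if line.strip()]
    return records


def format_question(record: dict) -> dict:
    """Hypothetical helper: turn one record into a prompt/target pair.

    Field names and prompt wording are assumptions made for illustration.
    """
    passage = record.get("passage") or ""
    options = record.get("options") or []
    prompt = passage + "Question: " + record["question"] + " "
    if options:
        # Multiple-choice tasks list the options; cloze tasks have none.
        prompt += "Choose from the following options: " + " ".join(options) + "\n"
    prompt += "Answer: "
    # One letter per option, or None for cloze-style questions.
    all_classes: Optional[List[str]] = list("ABCDEFG"[: len(options)]) or None
    target = record.get("label") or record.get("answer")
    return {"input": prompt, "target": target, "all_classes": all_classes}
```

A caller would typically run load_jsonl_dir once on the data directory and then apply format_question to every record of each subcategory.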

Usage

Use this class when you need to evaluate a language model on the AGIEval benchmark within the ColossalEval framework. It is instantiated with a path to the AGIEval data directory, a logger, and a few_shot flag that controls whether few-shot demonstration prompts are loaded.

Code Reference

Source Location

Repository: Hpcaitech_ColossalAI
File: applications/ColossalEval/colossal_eval/dataset/agieval.py
Lines: 1-267

Signature

class AGIEvalDataset(BaseDataset):
    @staticmethod
    def load(path: str, logger: DistributedLogger, few_shot: bool, *args, **kwargs) -> List[Dict]:

Import

from colossal_eval.dataset.agieval import AGIEvalDataset

I/O Contract

Inputs

Name	Type	Required	Description
path	str	Yes	Path to the directory containing AGIEval JSONL files and optional few_shot_prompts.csv
logger	DistributedLogger	Yes	Logger instance for distributed logging
few_shot	bool	Yes	Whether to load few-shot demonstration prompts from CSV

Outputs

Name	Type	Description
dataset	Dict[str, Dict]	A nested dictionary whose "test" split maps each subcategory name to an entry with two keys: "data" (the list of data samples) and "inference_kwargs" (the inference configuration: calculate_loss, all_classes, language, max_new_tokens, and few_shot_data). Note that the signature above is annotated List[Dict], although the value described here is a dictionary.
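
To make the nesting concrete, the following sketch builds a tiny dictionary of the shape described in the Outputs table and checks it. The subcategory name, the sample fields, and all values are placeholders chosen for illustration, and check_structure is a hypothetical helper rather than part of ColossalEval.

```python
from typing import Any, Dict, List

# Keys that the Outputs table lists under "inference_kwargs".
REQUIRED_KWARGS = {
    "calculate_loss",
    "all_classes",
    "language",
    "max_new_tokens",
    "few_shot_data",
}


def check_structure(dataset: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of structural problems (empty if none)."""
    problems: List[str] = []
    if "test" not in dataset:
        return ['missing "test" split']
    for category, entry in dataset["test"].items():
        if not isinstance(entry.get("data"), list):
            problems.append(f'{category}: "data" is not a list')
        missing = REQUIRED_KWARGS - set(entry.get("inference_kwargs", {}))
        if missing:
            problems.append(f"{category}: missing kwargs {sorted(missing)}")
    return problems


# Placeholder example; real samples and values come from the loader.
example = {
    "test": {
        "example-subcategory": {
            "data": [{"input": "Question: ... Answer: ", "target": "A"}],
            "inference_kwargs": {
                "calculate_loss": True,
                "all_classes": ["A", "B", "C", "D"],
                "language": "English",
                "max_new_tokens": 32,
                "few_shot_data": None,
            },
        }
    }
}

problems = check_structure(example)
```

A check like this can be run on the saved JSON before inference to catch a malformed subcategory early.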

Usage Examples

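from colossal_eval.dataset.agieval import AGIEvalDataset
from colossalai.logging import DistributedLogger

# Logger passed through to the dataset loader.
logger = DistributedLogger("agieval")
# Build the dataset from the AGIEval directory; few_shot=True also reads few_shot_prompts.csv.
dataset = AGIEvalDataset(path="/path/to/agieval/data", logger=logger, few_shot=True)
# Write the converted dataset to disk as JSON.
dataset.save("/path/to/output.json")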
