Principle:Sail sg LongSpec Prompt Formatting

Knowledge Sources	LongSpec
Domains	NLP, Evaluation, Prompt_Engineering
Last Updated	2026-02-14 05:00 GMT

Overview

This principle describes how to construct task-appropriate prompts from raw benchmark data, using task-specific string templates and model-specific chat templates.

Description

Prompt Formatting bridges raw evaluation data and model input by applying task-specific templates. Two distinct formatting patterns are used:

LongBench tasks: Simple string templates with {context} and {input} placeholders. Each task (gov_report, qmsum, multi_news, lcc, repobench-p) has a dedicated prompt template that provides task instructions and formats the source material.
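The LongBench pattern can be illustrated with a minimal sketch. The template text, the `format_longbench_prompt` helper, and the sample record below are hypothetical and written only for illustration; the actual per-task templates live in the implementation page and are not reproduced here.

```python
# Hypothetical template for illustration; the real per-task templates differ.
SUMMARY_TEMPLATE = (
    "You are given a report.\n\n"
    "Report:\n{context}\n\n"
    "Instruction: {input}\n\n"
    "Summary:"
)


def format_longbench_prompt(template: str, sample: dict) -> str:
    """Fill the {context} and {input} placeholders from one benchmark record."""
    return template.format(context=sample["context"], input=sample["input"])


# Made-up record with the two fields the template expects.
sample = {
    "context": "The agency reviewed three programs.",
    "input": "Summarize the report.",
}
prompt = format_longbench_prompt(SUMMARY_TEMPLATE, sample)
print(prompt)
```

Only the template string is parsed by `str.format`, so braces inside the substituted context (common in code tasks such as lcc) pass through unchanged. Literal braces in the template itself would need to be escaped as `{{` and `}}`.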

AIME/QwQ tasks: Qwen2 chat template format with system, user, and assistant roles using special tokens (<|im_start|>, <|im_end|>). The math problem is wrapped in a conversational structure that triggers chain-of-thought reasoning.
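The chat layout described above can be sketched by assembling the role-tagged turns by hand. The `format_chat_prompt` helper, the default system message, and the example problem are assumptions made for illustration; they are not taken from the LongSpec code.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_chat_prompt(problem: str, system: str = "You are a helpful assistant.") -> str:
    """Wrap a problem in system and user turns, then open the assistant turn.

    The default system message is a placeholder assumption.
    """
    turns = [("system", system), ("user", problem)]
    text = "".join(f"{IM_START}{role}\n{content}{IM_END}\n" for role, content in turns)
    # Leave the assistant turn open so the model generates the answer.
    return text + f"{IM_START}assistant\n"


chat_prompt = format_chat_prompt("Find the remainder when 2^10 is divided by 7.")
print(chat_prompt)
```

In practice the same string is usually produced by the tokenizer itself, for example through `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` in Hugging Face Transformers, which avoids hard-coding the special tokens.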

After formatting, each prompt is tokenized with the target model's tokenizer, and the resulting token IDs are moved to the GPU (CUDA) for inference.

Usage

Apply this principle when preparing prompts for evaluation. The prompt format must match the format the target model was trained on: Llama-based models use plain-text templates, while Qwen2-based models (QwQ) require the chat template format.

Theoretical Basis

Prompt formatting follows the template interpolation pattern where task context is inserted into a fixed instruction frame:

# Abstract pattern (not the actual implementation)
# Fill the task template with the raw benchmark fields
formatted = template.format(context=raw_data["context"], input=raw_data["input"])
# Tokenize the prompt and move the token IDs to the GPU
input_ids = tokenizer(formatted, return_tensors="pt").input_ids.cuda()
prompt_length = input_ids.shape[1]  # Generated tokens start at this index
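
The role of `prompt_length` can be shown with a short sketch. It assumes the model output contains the prompt tokens followed by the continuation, which is how decoder-only generation typically returns sequences. The `split_generation` helper and the token IDs are made up for illustration, and plain lists stand in for tensors.

```python
def split_generation(output_ids: list[int], prompt_length: int) -> list[int]:
    """Return only the tokens produced after the prompt."""
    return output_ids[prompt_length:]


prompt_ids = [101, 7592, 2088]          # made-up token IDs for the prompt
output_ids = prompt_ids + [2023, 2003]  # output echoes the prompt, then continues
generated = split_generation(output_ids, len(prompt_ids))
print(generated)
```

Slicing at `prompt_length` keeps the scored text free of the instructions and source material, so metrics are computed on the model's continuation only.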

Related Pages

Implemented By

Implementation:Sail_sg_LongSpec_Dataset_Prompt_Templates
