Implementation:Sail sg LongSpec Dataset Prompt Templates
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, Prompt_Engineering |
| Last Updated | 2026-02-14 05:00 GMT |
Overview
Concrete tool for formatting evaluation prompts: task-specific templates for LongBench tasks and a Qwen2 chat template for AIME mathematical reasoning.
Description
The dataset2prompt dictionary in inference_long-bench.py maps LongBench task names to prompt template strings containing a {context} placeholder and, for query-based tasks, an {input} placeholder. For QwQ inference, a Qwen2-specific chat template wraps each math problem in system/user/assistant role markers.
This is a Pattern Doc: the templates are defined inline in the evaluation scripts.
Usage
The templates are applied after data loading and before tokenization in the evaluation scripts.
Code Reference
Source Location
- Repository: LongSpec
- File (LongBench): longspec/test/inference_long-bench.py
- Lines: L8-39
- File (QwQ): longspec/test/inference_qwq.py
- Lines: L55-67
Signature
# Pattern Doc: template dictionary and chat template formatting
dataset2prompt = {
    "gov_report": "You are given a report by a government agency. "
                  "Write a one-page summary of the report.\n\n"
                  "Report:\n{context}\n\nNow, write a one-page summary of the report.\n\nSummary:",
    "qmsum": "You are given a meeting transcript and a query containing a question or instruction. "
             "Answer the query in one or more sentences.\n\n"
             "Transcript:\n{context}\n\nQuery: {input}\n\nAnswer:",
    "multi_news": "You are given several news passages. Write a one-page summary of all news.\n\n"
                  "{context}\n\nNow, write a one-page summary of all the news.\n\nSummary:",
    "lcc": "Please complete the code given below.\n{context}",
    "repobench-p": "Please complete the code given below.\n{context}",
}
# QwQ chat template; fill the {problem} placeholder with qwq_prompt.format(problem=...)
qwq_prompt = (
    "<|im_start|>system\nYou are a helpful and harmless assistant. "
    "You should think step-by-step.<|im_end|>\n"
    "<|im_start|>user\n{problem}<|im_end|>\n"
    "<|im_start|>assistant\n"
)
Import
# No separate import: templates are defined inline in evaluation scripts
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| task | str | Yes (LongBench) | Task name selecting the template from dataset2prompt |
| context | str | Yes (LongBench) | Document/article text to include |
| input | str | No | Query or question; used only by templates with an {input} placeholder, such as qmsum |
| problem | str | Yes (AIME) | Math problem text for the QwQ chat template |
Outputs
| Name | Type | Description |
|---|---|---|
| formatted_prompt | str | Complete prompt string ready for tokenization |
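The contract above can be sketched as a small helper. This is a minimal illustration, not code from the LongSpec repository: the name `format_prompt` is hypothetical, the sample inputs are made up, and only two of the templates from the Signature section are copied in so the block runs on its own.

```python
# Hypothetical helper illustrating the I/O contract; not part of LongSpec.
# Two templates copied from the Signature section for self-containment.
dataset2prompt = {
    "gov_report": (
        "You are given a report by a government agency. "
        "Write a one-page summary of the report.\n\n"
        "Report:\n{context}\n\nNow, write a one-page summary of the report.\n\nSummary:"
    ),
    "qmsum": (
        "You are given a meeting transcript and a query containing a question or instruction. "
        "Answer the query in one or more sentences.\n\n"
        "Transcript:\n{context}\n\nQuery: {input}\n\nAnswer:"
    ),
}


def format_prompt(task: str, context: str, input: str = "") -> str:
    """Return the prompt for `task`, filling {context} and, if present, {input}."""
    if task not in dataset2prompt:
        raise KeyError(f"no prompt template for task {task!r}")
    # str.format ignores unused keyword arguments, so passing `input` to a
    # template without an {input} placeholder is harmless.
    return dataset2prompt[task].format(context=context, input=input)


# Illustrative inputs only.
summary_prompt = format_prompt("gov_report", "The Federal Reserve reported...")
query_prompt = format_prompt("qmsum", "Alice: ...", "What was decided?")
```

Raising on an unknown task name is a design choice of this sketch; a plain dictionary lookup would raise `KeyError` as well, only with a less descriptive message.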
Usage Examples
LongBench Prompt Formatting
task = "gov_report"
item = {"context": "The Federal Reserve reported...", "input": ""}
prompt = dataset2prompt[task].format(
    context=item["context"],
    input=item.get("input", ""),
)
# Result: "You are given a report by a government agency. Write a one-page summary..."
QwQ Chat Template
problem = "Find the sum of all positive integers n such that..."
prompt = (
    "<|im_start|>system\n"
    "You are a helpful and harmless assistant. You should think step-by-step.<|im_end|>\n"
    f"<|im_start|>user\n{problem}<|im_end|>\n"
    "<|im_start|>assistant\n"
)
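The chat template can also be kept as a reusable string and filled per problem. The following is a minimal sketch under stated assumptions: `QWQ_TEMPLATE` and `build_qwq_prompt` are hypothetical names, and the sample problem is made up for illustration. Math problems may contain LaTeX braces; `str.format` does not re-parse substituted values, so braces inside the problem text pass through unchanged.

```python
# Hypothetical wrapper around the QwQ chat template; names are illustrative.
QWQ_TEMPLATE = (
    "<|im_start|>system\n"
    "You are a helpful and harmless assistant. You should think step-by-step.<|im_end|>\n"
    "<|im_start|>user\n{problem}<|im_end|>\n"
    "<|im_start|>assistant\n"
)


def build_qwq_prompt(problem: str) -> str:
    """Wrap a math problem in the system/user/assistant role markers."""
    # Only the template is parsed for placeholders; braces in `problem`
    # are inserted verbatim.
    return QWQ_TEMPLATE.format(problem=problem)


# Illustrative problem text containing LaTeX braces.
prompt = build_qwq_prompt(r"Compute $\frac{1}{2} + \frac{1}{3}$.")
```

The prompt ends with the opening assistant marker so that generation continues as the assistant turn.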
Related Pages
Implements Principle