Implementation:Sail sg LongSpec Dataset Prompt Templates

From Leeroopedia
Revision as of 13:49, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Sail_sg_LongSpec_Dataset_Prompt_Templates.md)
Knowledge Sources
Domains: Evaluation, Prompt_Engineering
Last Updated: 2026-02-14 05:00 GMT

Overview

Concrete tool for formatting evaluation prompts: task-specific template strings for LongBench tasks and a Qwen2 chat template for AIME mathematical reasoning problems.

Description

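The dataset2prompt dictionary in inference_long-bench.py maps each task name to a prompt template string containing a {context} placeholder and, for query-based tasks such as qmsum, an {input} placeholder. For QwQ inference, a Qwen2-specific chat template wraps each math problem in system, user, and assistant turns delimited by <|im_start|> and <|im_end|> markers, and ends with an open assistant turn for the model to complete.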

This is a Pattern Doc: the templates are defined inline in the evaluation scripts rather than exposed through a separate importable module.

Usage

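Apply the templates in the evaluation scripts after data loading and before tokenization: each example's fields are substituted into the matching template, and the resulting string is what gets tokenized.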

Code Reference

Source Location

  • Repository: LongSpec
  • File (LongBench): longspec/test/inference_long-bench.py
  • Lines: L8-39
  • File (QwQ): longspec/test/inference_qwq.py
  • Lines: L55-67

Signature

# Pattern Doc: Template dictionary and chat template formatting

dataset2prompt = {
    "gov_report": "You are given a report by a government agency. "
                  "Write a one-page summary of the report.\n\n"
                  "Report:\n{context}\n\nNow, write a one-page summary of the report.\n\nSummary:",
    "qmsum": "You are given a meeting transcript and a query containing a question or instruction. "
             "Answer the query in one or more sentences.\n\n"
             "Transcript:\n{context}\n\nQuery: {input}\n\nAnswer:",
    "multi_news": "You are given several news passages. Write a one-page summary of all news.\n\n"
                  "{context}\n\nNow, write a one-page summary of all the news.\n\nSummary:",
    "lcc": "Please complete the code given below.\n{context}",
    "repobench-p": "Please complete the code given below.\n{context}",
}

# QwQ chat template; {problem} is a placeholder filled via qwq_prompt.format(problem=...)
qwq_prompt = (
    "<|im_start|>system\nYou are a helpful and harmless assistant. "
    "You should think step-by-step.<|im_end|>\n"
    "<|im_start|>user\n{problem}<|im_end|>\n"
    "<|im_start|>assistant\n"
)

Import

# No separate import — templates are defined inline in evaluation scripts

I/O Contract

Inputs

Name | Type | Required | Description
task | str | Yes | Task name selecting the template
context | str | Yes | Document/article text to include
input | str | No | Optional query/question (used by qmsum)
problem | str | Yes (AIME) | Math problem text for QwQ chat template

Outputs

Name | Type | Description
formatted_prompt | str | Complete prompt string ready for tokenization
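
The contract above could be wrapped in a small function. This is a minimal sketch, not code from the repository: the helper name format_longbench_prompt and the ValueError on unknown tasks are assumptions made for illustration, and DATASET2PROMPT is a two-entry stand-in for the dataset2prompt dictionary shown under Signature.

```python
# Hypothetical helper; the evaluation scripts format prompts inline instead.
# Two-entry stand-in for the dataset2prompt dictionary shown under Signature.
DATASET2PROMPT = {
    "qmsum": (
        "You are given a meeting transcript and a query containing a question or instruction. "
        "Answer the query in one or more sentences.\n\n"
        "Transcript:\n{context}\n\nQuery: {input}\n\nAnswer:"
    ),
    "lcc": "Please complete the code given below.\n{context}",
}


def format_longbench_prompt(task: str, context: str, input: str = "") -> str:
    """Return the prompt for `task`, raising ValueError for unknown task names."""
    try:
        template = DATASET2PROMPT[task]
    except KeyError:
        raise ValueError(f"no prompt template for task {task!r}") from None
    # str.format ignores keyword arguments the template does not reference,
    # so passing `input` is harmless for context-only tasks such as lcc.
    return template.format(context=context, input=input)


formatted_prompt = format_longbench_prompt(
    "qmsum", context="Alice: Let's ship on Friday.", input="When is the release?"
)
```

The parameter is named input to mirror the contract table, which shadows the Python builtin inside the function; a real helper might prefer a name such as query.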

Usage Examples

LongBench Prompt Formatting

task = "gov_report"
item = {"context": "The Federal Reserve reported...", "input": ""}

prompt = dataset2prompt[task].format(
    context=item["context"],
    input=item.get("input", ""),
)
# Result: "You are given a report by a government agency. Write a one-page summary..."
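
The gov_report template never references {input}, so the extra keyword argument in the call above is silently ignored. One way to see which fields a given template actually consumes is to parse it with the standard library's string.Formatter. The sketch below is illustrative only: the placeholders helper is a hypothetical name, and the two templates are copied from the dictionary under Signature.

```python
from string import Formatter

# Two templates copied from the dataset2prompt dictionary under Signature.
templates = {
    "multi_news": (
        "You are given several news passages. Write a one-page summary of all news.\n\n"
        "{context}\n\nNow, write a one-page summary of all the news.\n\nSummary:"
    ),
    "qmsum": (
        "You are given a meeting transcript and a query containing a question or instruction. "
        "Answer the query in one or more sentences.\n\n"
        "Transcript:\n{context}\n\nQuery: {input}\n\nAnswer:"
    ),
}


def placeholders(template: str) -> set[str]:
    """Return the named replacement fields used in a str.format template."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text, so those entries are skipped.
    return {field for _, field, _, _ in Formatter().parse(template) if field}


required = {task: placeholders(template) for task, template in templates.items()}
```

Here required["multi_news"] contains only context, while required["qmsum"] contains both context and input, which matches the Required column of the input table.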

QwQ Chat Template

problem = "Find the sum of all positive integers n such that..."
prompt = (
    "<|im_start|>system\n"
    "You are a helpful and harmless assistant. You should think step-by-step.<|im_end|>\n"
    f"<|im_start|>user\n{problem}<|im_end|>\n"
    "<|im_start|>assistant\n"
)
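
Math problems often contain LaTeX braces, which raises the question of whether they interfere with formatting. The sketch below, written under the assumption that the template is kept as a reusable string and filled with str.format, shows that only the template is parsed for fields and the problem text is inserted verbatim. QWQ_TEMPLATE and build_qwq_prompt are hypothetical names; the literal text matches the example above.

```python
# Template with a single {problem} field; the literal text matches the example above.
QWQ_TEMPLATE = (
    "<|im_start|>system\n"
    "You are a helpful and harmless assistant. You should think step-by-step.<|im_end|>\n"
    "<|im_start|>user\n{problem}<|im_end|>\n"
    "<|im_start|>assistant\n"
)


def build_qwq_prompt(problem: str) -> str:
    """Wrap one problem statement in the chat markers (hypothetical helper)."""
    # Only the template is scanned for replacement fields; braces inside
    # `problem` are copied into the result unchanged.
    return QWQ_TEMPLATE.format(problem=problem)


latex_problem = r"Compute \frac{1}{2} + \frac{1}{3}."
prompt = build_qwq_prompt(latex_problem)
```

The f-string form in the example above is equally safe for brace-heavy input; the template form is mainly convenient when the same string is reused across many problems.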

Related Pages

Implements Principle
