Principle:OpenBMB UltraFeedback Prompt Formatting

From Leeroopedia
Revision as of 17:23, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/OpenBMB_UltraFeedback_Prompt_Formatting.md)


Knowledge Sources
Domains NLP, Prompt_Engineering, Inference
Last Updated 2023-10-02 00:00 GMT

Overview

A template-based prompt construction system that formats instructions and system prompts into model-specific conversation formats for consistent multi-model inference.

Description

Prompt Formatting addresses the fact that different LLM families expect different input formats. A prompt formatted for LLaMA-2 (which uses [INST] and <<SYS>> tags) does not match what Vicuna (which uses USER: and ASSISTANT: roles) or MPT (which uses <|im_start|> and <|im_end|> tags) saw during training, so sending it to those models can degrade or garble the output.
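To make the difference concrete, the sketch below renders one system prompt and one instruction in the three formats named above. The function names are hypothetical and the layouts are simplified: exact whitespace, BOS/EOS tokens, and default system prompts vary between implementations, so treat this as an illustration rather than the pipeline's actual templates.

```python
# Illustrative only: simplified single-turn layouts for three model families.
# Function names are hypothetical; whitespace and special tokens are assumptions.

def llama2_prompt(system: str, instruction: str) -> str:
    # LLaMA-2 chat wraps the system prompt in <<SYS>> tags inside the first [INST] block.
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{instruction} [/INST]"


def vicuna_prompt(system: str, instruction: str) -> str:
    # Vicuna-style "ROLE: message" turns; the trailing "ASSISTANT:" cues generation.
    return f"{system} USER: {instruction} ASSISTANT:"


def chatml_prompt(system: str, instruction: str) -> str:
    # ChatML delimits every turn with <|im_start|>role ... <|im_end|>.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{instruction}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    system = "You are a helpful assistant."
    instruction = "Name three primes."
    for render in (llama2_prompt, vicuna_prompt, chatml_prompt):
        print(f"--- {render.__name__} ---")
        print(render(system, instruction))
```

The same two inputs yield three strings that share almost no delimiter tokens, which is why a single hard-coded format cannot serve a multi-model pipeline.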

The UltraFeedback pipeline uses a Conversation dataclass (adapted from the FastChat library) that encapsulates prompt formatting for 14 different separator styles. Each model family maps to a pre-defined Conversation template with the correct system prompt format, role names, separator tokens, and stop criteria.

The formatting process:

  1. Copy the appropriate template for the model family
  2. Append the principle system prompt to the template's system field
  3. Add the user instruction as the first message
  4. Add an empty assistant message (to trigger generation)
  5. Call get_prompt() to render the formatted string

Special cases exist for UltraLM (custom -delimited format), StarChat (<|system|>/<|end|> tags), and WizardLM-7B (simple ### Response: format).

Usage

Use this principle whenever you need to generate completions from multiple LLM architectures using a single pipeline. Each model must receive its input in the format it was trained on; incorrect formatting degrades output quality or causes the model to treat formatting tokens as content.

Theoretical Basis

Different LLMs are trained with different conversation formats. The key insight is that the separator style (how roles, messages, and turns are delimited) is the primary axis of variation. The Conversation dataclass captures this with a SeparatorStyle enum supporting 14 styles:

  • ADD_COLON_SINGLE/TWO: "ROLE: message" with one or two separators (Vicuna, Alpaca)
  • LLAMA2: Special [INST] and <<SYS>> tag-based format
  • CHATML: <|im_start|>/<|im_end|> format (MPT)
  • RWKV: Newline-based format (Falcon)
  • And others: ADD_COLON_SPACE_SINGLE, NO_COLON_SINGLE/TWO, ADD_NEW_LINE_SINGLE, CHATGLM, CHATINTERN, DOLLY, PHOENIX, ROBIN
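As a rough illustration of how a separator style drives rendering, the sketch below implements only the two ADD_COLON styles. It is modeled loosely on the FastChat approach; the `render` function, its parameters, and the two-member enum are assumptions made for this example, not the pipeline's code.

```python
from enum import Enum, auto
from typing import List, Optional, Tuple


class SeparatorStyle(Enum):
    # Only two of the styles are sketched here.
    ADD_COLON_SINGLE = auto()
    ADD_COLON_TWO = auto()


def render(
    style: SeparatorStyle,
    system: str,
    messages: List[Tuple[str, Optional[str]]],
    sep: str,
    sep2: Optional[str] = None,
) -> str:
    """Render "ROLE: message" turns; a None message leaves the turn open."""
    if style is SeparatorStyle.ADD_COLON_SINGLE:
        seps = [sep, sep]            # the same separator follows every turn
    elif style is SeparatorStyle.ADD_COLON_TWO:
        if sep2 is None:
            raise ValueError("ADD_COLON_TWO needs a second separator")
        seps = [sep, sep2]           # alternate: sep after user, sep2 after assistant
    else:
        raise ValueError(f"unsupported style: {style}")

    out = system + seps[0]
    for i, (role, message) in enumerate(messages):
        if message:
            out += f"{role}: {message}{seps[i % 2]}"
        else:
            out += f"{role}:"        # open turn for the model to complete
    return out


if __name__ == "__main__":
    turns = [("USER", "Hi"), ("ASSISTANT", "Hello"), ("USER", "Bye"), ("ASSISTANT", None)]
    print(render(SeparatorStyle.ADD_COLON_TWO, "SYS", turns, sep=" ", sep2="</s>"))
```

With two separators, the second one (here an end-of-sequence marker) closes each assistant turn, while the first merely spaces the user turn from the reply that follows it.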

Pseudo-code Logic:

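# Abstract algorithm; conv_template is assumed to map a model family to a Conversation template
def format_prompt(model_family: str, instruction: str, principle_prompt: str) -> str:
    # Copy so the shared template is not mutated across calls
    template = conv_template[model_family].copy()
    # Append the principle system prompt to the template's system field
    template.system += " " + principle_prompt
    template.append_message(template.roles[0], instruction)  # user turn
    template.append_message(template.roles[1], None)          # empty assistant turn triggers generation
    return template.get_prompt()                               # render the model-specific string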
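The pseudo-code leaves Conversation and conv_template undefined. A minimal self-contained sketch follows, assuming a stripped-down Conversation with a single "ROLE: message" rendering style and a registry holding one hypothetical "vicuna" entry. The real dataclass carries more fields (separator style, stop criteria), so this only demonstrates the copy, append, and render flow.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional, Tuple


@dataclass
class Conversation:
    """Stripped-down stand-in for the real template class (an assumption)."""
    system: str
    roles: Tuple[str, str]
    sep: str = " "
    messages: List[Tuple[str, Optional[str]]] = field(default_factory=list)

    def copy(self) -> "Conversation":
        # Copy the message list too, so appends never leak into the registry entry.
        return replace(self, messages=list(self.messages))

    def append_message(self, role: str, message: Optional[str]) -> None:
        self.messages.append((role, message))

    def get_prompt(self) -> str:
        out = self.system + self.sep
        for role, message in self.messages:
            out += f"{role}: {message}{self.sep}" if message else f"{role}:"
        return out


# Hypothetical registry with a single illustrative entry.
conv_template: Dict[str, Conversation] = {
    "vicuna": Conversation(
        system="A chat between a user and an assistant.",
        roles=("USER", "ASSISTANT"),
    ),
}


def format_prompt(model_family: str, instruction: str, principle_prompt: str) -> str:
    template = conv_template[model_family].copy()
    template.system += " " + principle_prompt
    template.append_message(template.roles[0], instruction)  # user turn
    template.append_message(template.roles[1], None)          # open assistant turn
    return template.get_prompt()


if __name__ == "__main__":
    print(format_prompt("vicuna", "Name a prime.", "Be honest."))
```

Copying before mutation matters here: because the system prompt and messages are modified per request, formatting directly on the registry entry would let the principle prompt and earlier instructions accumulate across calls.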

Related Pages

Implemented By
