Principle:OpenBMB UltraFeedback Prompt Formatting
| Knowledge Sources | |
|---|---|
| Domains | NLP, Prompt_Engineering, Inference |
| Last Updated | 2023-10-02 00:00 GMT |
Overview
A template-based prompt construction system that formats instructions and system prompts into model-specific conversation formats for consistent multi-model inference.
Description
Prompt Formatting addresses the problem that different LLM architectures expect different input formats. A prompt formatted for LLaMA-2 (which uses [INST] and <<SYS>> tags) can produce degraded or garbled output on Vicuna (which uses USER: and ASSISTANT: roles) or MPT (which uses <|im_start|> and <|im_end|> tags), because each model was trained on its own delimiters.
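To make the difference concrete, the sketch below renders one system prompt and one instruction in the three layouts named above. The function names are hypothetical, and the exact whitespace and special tokens (for example a leading BOS token) are approximations of the commonly documented layouts, not strings copied from the UltraFeedback code.

```python
def render_llama2(system: str, instruction: str) -> str:
    # LLaMA-2 chat layout: the system prompt sits inside <<SYS>> tags,
    # nested in the first [INST] block.
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{instruction} [/INST]"


def render_vicuna(system: str, instruction: str) -> str:
    # Vicuna-style layout: plain-text role names; the trailing
    # "ASSISTANT:" cues the model to start its reply.
    return f"{system} USER: {instruction} ASSISTANT:"


def render_chatml(system: str, instruction: str) -> str:
    # ChatML layout: every turn is wrapped in <|im_start|>/<|im_end|>;
    # the assistant turn is left open for generation.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{instruction}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    system = "You are a helpful assistant."
    instruction = "Name three primary colors."
    for render in (render_llama2, render_vicuna, render_chatml):
        print(f"--- {render.__name__} ---")
        print(render(system, instruction))
```

The same two inputs yield three structurally different strings, which is why a single hard-coded prompt cannot serve every model family.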
The UltraFeedback pipeline uses a Conversation dataclass (adapted from the FastChat library) that encapsulates prompt formatting for 14 different separator styles. Each model family maps to a pre-defined Conversation template with the correct system prompt format, role names, separator tokens, and stop criteria.
The formatting process:
- Copy the appropriate template for the model family
- Append the principle system prompt to the template's system field
- Add the user instruction as the first message
- Add an empty assistant message (to trigger generation)
- Call get_prompt() to render the formatted string
Special cases exist for UltraLM (custom -delimited format), StarChat (<|system|>/<|end|> tags), and WizardLM-7B (simple ### Response: format).
Usage
Use this principle whenever you need to generate completions from multiple LLM architectures using a single pipeline. Each model must receive its input in the format it was trained on; incorrect formatting degrades output quality or causes the model to treat formatting tokens as content.
Theoretical Basis
Different LLMs are trained with different conversation formats. The key insight is that the separator style (how roles, messages, and turns are delimited) is the primary axis of variation. The Conversation dataclass captures this with a SeparatorStyle enum supporting 14 styles, including:
- ADD_COLON_SINGLE/TWO: "ROLE: message" with one or two separators (Vicuna, Alpaca)
- LLAMA2: Special [INST] and <<SYS>> tag-based format
- CHATML: <|im_start|>/<|im_end|> format (MPT)
- RWKV: Newline-based format (Falcon)
- And others: NO_COLON, ADD_NEW_LINE, CHATGLM, CHATINTERN, DOLLY, PHOENIX, ROBIN
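As an illustration of how an enum can drive rendering, here is a minimal sketch covering only the two ADD_COLON styles. MiniConversation and its fields are hypothetical stand-ins modeled loosely on the FastChat design; the real Conversation dataclass handles many more styles and edge cases.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional, Tuple


class SeparatorStyle(Enum):
    ADD_COLON_SINGLE = auto()  # one separator after every turn
    ADD_COLON_TWO = auto()     # alternate between two separators


@dataclass
class MiniConversation:
    system: str
    roles: Tuple[str, str]
    sep_style: SeparatorStyle
    sep: str
    sep2: Optional[str] = None
    messages: List[Tuple[str, Optional[str]]] = field(default_factory=list)

    def get_prompt(self) -> str:
        if self.sep_style is SeparatorStyle.ADD_COLON_SINGLE:
            seps = [self.sep, self.sep]
        elif self.sep_style is SeparatorStyle.ADD_COLON_TWO:
            seps = [self.sep, self.sep2 if self.sep2 is not None else self.sep]
        else:
            raise ValueError(f"unsupported style: {self.sep_style}")

        out = self.system + seps[0]
        for i, (role, message) in enumerate(self.messages):
            if message is None:
                # Open turn: emit only the role tag so the model continues.
                out += role + ":"
            else:
                out += role + ": " + message + seps[i % 2]
        return out


turns = [("USER", "Hi"), ("ASSISTANT", "Hello"), ("USER", "Bye"), ("ASSISTANT", None)]

single = MiniConversation("SYS", ("USER", "ASSISTANT"),
                          SeparatorStyle.ADD_COLON_SINGLE, sep="\n", messages=list(turns))
two = MiniConversation("SYS", ("USER", "ASSISTANT"),
                       SeparatorStyle.ADD_COLON_TWO, sep=" ", sep2="</s>", messages=list(turns))

if __name__ == "__main__":
    print(repr(single.get_prompt()))
    print(repr(two.get_prompt()))
```

With two separators, the second one closes each completed assistant turn, which lets an end-of-sequence token follow every assistant reply while user turns are separated by plain whitespace.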
Pseudo-code Logic:
# Abstract algorithm
def format_prompt(model_family: str, instruction: str, principle_prompt: str) -> str:
    # Copy so the shared template is never mutated
    template = conv_template[model_family].copy()
    template.system += " " + principle_prompt
    template.append_message(template.roles[0], instruction)  # user turn
    template.append_message(template.roles[1], None)  # empty assistant turn
    return template.get_prompt()
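The abstract algorithm can be exercised end to end with a small stand-in for the template registry. In this sketch, Template, CONV_TEMPLATES, and the "vicuna" entry are assumptions made for illustration; they are not the actual UltraFeedback or FastChat objects, and only one simple colon-based layout is rendered.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Template:
    system: str
    roles: Tuple[str, str]
    sep: str = " "
    messages: List[Tuple[str, Optional[str]]] = field(default_factory=list)

    def copy(self) -> "Template":
        # Deep copy so appended messages never leak into the shared template.
        return deepcopy(self)

    def append_message(self, role: str, message: Optional[str]) -> None:
        self.messages.append((role, message))

    def get_prompt(self) -> str:
        out = self.system + self.sep
        for role, message in self.messages:
            if message is None:
                out += f"{role}:"  # open assistant turn
            else:
                out += f"{role}: {message}{self.sep}"
        return out


# Hypothetical registry keyed by model family.
CONV_TEMPLATES: Dict[str, Template] = {
    "vicuna": Template(
        system="A chat between a curious user and an AI assistant.",
        roles=("USER", "ASSISTANT"),
    ),
}


def format_prompt(model_family: str, instruction: str, principle_prompt: str) -> str:
    conv = CONV_TEMPLATES[model_family].copy()
    conv.system = f"{conv.system} {principle_prompt}".strip()
    conv.append_message(conv.roles[0], instruction)  # user turn
    conv.append_message(conv.roles[1], None)         # empty assistant turn
    return conv.get_prompt()


if __name__ == "__main__":
    print(format_prompt("vicuna", "Summarize this paragraph.", "Be honest."))
```

Copying before mutating matters because the same registry entry is reused for every instruction; without the copy, each call would accumulate system text and messages from earlier calls.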