Implementation:FMInference FlexLLMGen Prompt Construction Pattern

Field	Value
Sources	FlexLLMGen
Domains	Prompt_Engineering, NLP
Last Updated	2026-02-09 00:00 GMT

Overview

Pattern documentation for constructing few-shot text completion prompts as used by FlexLLMGen's completion application.

Description

This is a Pattern Doc: it documents a user-defined pattern rather than a library API. FlexLLMGen's completion.py builds its prompts as a list of Python strings. Each string is one multi-line prompt: few-shot examples followed by the query, with the final label (for example "Answer:") left open for the model to complete. The examples use newline-delimited Question/Answer pairs or Text/Airport codes pairs. The list length must equal gpu_batch_size * num_gpu_batches; the completion example uses num_gpu_batches=1, so there the length equals gpu_batch_size.
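The pattern relies on Python's implicit concatenation of adjacent string literals: literals separated only by whitespace merge into one string, and a comma starts the next list element. The sketch below uses placeholder prompt text, not FlexLLMGen code, to show how a single missing comma silently merges two prompts and changes the list length.

```python
# Adjacent string literals are joined into one string; commas separate prompts.
two_prompts = [
    "Question: A?\n"
    "Answer:",          # this comma ends the first prompt
    "Question: B?\n"
    "Answer:",
]

# Same text, but the comma after the first "Answer:" is missing,
# so all four literals collapse into a single prompt.
merged_by_mistake = [
    "Question: A?\n"
    "Answer:"
    "Question: B?\n"
    "Answer:",
]

print(len(two_prompts))        # 2
print(len(merged_by_mistake))  # 1
```

Because the length of the list has to match the batch size, a merged prompt of this kind shows up as a length mismatch rather than as a syntax error.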

Code Reference

Source: flexllmgen/apps/completion.py, Lines: 11-22
Import: No import needed -- this is a user-defined pattern

Pattern interface:

# User-defined prompt construction pattern
# Each prompt is a multi-line string with few-shot examples

prompts = [
    # Q&A format
    "Question: Where were the 2004 Olympics held?\n"
    "Answer: Athens, Greece\n"
    "Question: What is the longest river on the earth?\n"
    "Answer:",

    # Extraction format
    "Extract the airport codes from this text.\n"
    "Text: \"I want a flight from New York to San Francisco.\"\n"
    "Airport codes: JFK, SFO.\n"
    "Text: \"I want you to book a flight from Phoenix to Las Vegas.\"\n"
    "Airport codes:",
]

# Constraint: len(prompts) == policy.gpu_batch_size * policy.num_gpu_batches
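The length constraint can be checked before any model work starts. The following is a minimal sketch: check_prompt_count is a hypothetical helper, not part of FlexLLMGen, and it takes the two batch parameters as plain integers so that it runs without the library installed.

```python
from typing import List


def check_prompt_count(prompts: List[str], gpu_batch_size: int,
                       num_gpu_batches: int = 1) -> None:
    """Raise ValueError if the prompt count does not fill the batch exactly.

    Hypothetical helper: the parameter names mirror the policy fields named
    in the constraint, but this function is not a FlexLLMGen API.
    """
    expected = gpu_batch_size * num_gpu_batches
    if len(prompts) != expected:
        raise ValueError(
            f"got {len(prompts)} prompts, expected {expected} "
            f"({gpu_batch_size} x {num_gpu_batches})"
        )


prompts = ["Question: A?\nAnswer:", "Question: B?\nAnswer:"]
check_prompt_count(prompts, gpu_batch_size=2)  # passes silently
```

Failing early with a clear message is a design choice for this sketch; how FlexLLMGen itself reacts to a mismatch is not covered here.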

I/O Contract

Direction	Name	Description
Input	Task description	Conceptual description of what the model should do
Input	Few-shot examples	Labeled demonstrations establishing the input-output format
Input	Query	The actual input to process
Output	prompts	List[str]; prompt strings ready for tokenization. Length must equal gpu_batch_size * num_gpu_batches

Usage Examples

# Example 1: Question-Answering prompt
prompts = [
    "Question: What is the capital of France?\n"
    "Answer: Paris\n"
    "Question: What is the capital of Japan?\n"
    "Answer:",
]

# Example 2: Classification prompt (an alternative to Example 1; it rebinds prompts)
prompts = [
    "Classify the sentiment: 'Great product!' -> Positive\n"
    "Classify the sentiment: 'Terrible experience.' -> Negative\n"
    "Classify the sentiment: 'It works fine.' ->",
]

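# Pass to tokenizer and model.
# Assumes tokenizer and model are already set up as in completion.py and
# that len(prompts) matches the policy's batch size.
# Assumption: stop is the token ID at which generation halts, here the newline
# token; this indexing only works if the tokenizer does not prepend a BOS token.
stop = tokenizer("\n").input_ids[0]
inputs = tokenizer(prompts, padding="max_length", max_length=128)
output_ids = model.generate(inputs.input_ids, max_new_tokens=32, stop=stop)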

Related Pages

Principle:FMInference_FlexLLMGen_Input_Prompt_Design
