Implementation:Microsoft LoRA Run Generation

Overview

The run_generation.py script performs conditional text generation using auto-regressive language models (GPT-2, CTRL, OpenAI-GPT, XLNet, Transformer-XL, XLM) with configurable sampling parameters.

Description

This script provides a command-line interface for generating text with six supported auto-regressive model families. It handles model-specific input preprocessing, tokenization, generation, and output formatting.

Supported Models (via MODEL_CLASSES dict):

Model Type Key	Model Class	Tokenizer Class
`gpt2`	`GPT2LMHeadModel`	`GPT2Tokenizer`
`ctrl`	`CTRLLMHeadModel`	`CTRLTokenizer`
`openai-gpt`	`OpenAIGPTLMHeadModel`	`OpenAIGPTTokenizer`
`xlnet`	`XLNetLMHeadModel`	`XLNetTokenizer`
`transfo-xl`	`TransfoXLLMHeadModel`	`TransfoXLTokenizer`
`xlm`	`XLMWithLMHeadModel`	`XLMTokenizer`
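The lookup against this table can be sketched as follows. This is an illustration under stated assumptions, not the script's own code: class names are stored as strings so the snippet runs without the transformers package installed, the helper name resolve_model_classes is hypothetical, and lowercasing the key before lookup is assumed.

```python
# Stand-in registry: class names as strings so the sketch runs without
# transformers. The real MODEL_CLASSES dict holds the classes themselves.
MODEL_CLASSES = {
    "gpt2": ("GPT2LMHeadModel", "GPT2Tokenizer"),
    "ctrl": ("CTRLLMHeadModel", "CTRLTokenizer"),
    "openai-gpt": ("OpenAIGPTLMHeadModel", "OpenAIGPTTokenizer"),
    "xlnet": ("XLNetLMHeadModel", "XLNetTokenizer"),
    "transfo-xl": ("TransfoXLLMHeadModel", "TransfoXLTokenizer"),
    "xlm": ("XLMWithLMHeadModel", "XLMTokenizer"),
}


def resolve_model_classes(model_type: str) -> tuple:
    """Hypothetical helper: map a --model_type value to (model, tokenizer)."""
    key = model_type.lower()  # assumption: the key is matched case-insensitively
    try:
        return MODEL_CLASSES[key]
    except KeyError:
        supported = ", ".join(sorted(MODEL_CLASSES))
        raise ValueError(
            f"Unsupported model type {model_type!r}; expected one of: {supported}"
        ) from None


model_name, tokenizer_name = resolve_model_classes("gpt2")
```

Failing early with the list of supported keys keeps a mistyped --model_type from surfacing later as an opaque KeyError.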

Model-specific Preprocessing (via PREPROCESSING_FUNCTIONS dict):

CTRL: Warns if temperature > 0.7 (CTRL works better with low temperature) and warns if the prompt does not start with a control code.
XLM: Sets the language ID on the model config when language embeddings are available.
XLNet / Transformer-XL: Prepends padding text (a historical narrative) to give the model more context when the prompt is short, following the approach by Aman Rusia.
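The dispatch described above can be sketched roughly as below. The function bodies are simplified stand-ins rather than the script's code: PADDING_TEXT is a placeholder for the real passage, the helper names prepare_padded_input and preprocess are hypothetical, the XLM branch is omitted, and the rule that a user-supplied --prefix takes priority over the padding text is an assumption.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)

# Placeholder only: the script ships a much longer narrative passage.
PADDING_TEXT = "<placeholder padding passage> "


def prepare_ctrl_input(args, _model, _tokenizer, prompt_text):
    # CTRL tends to work better at low temperature, so flag high values.
    if args.temperature > 0.7:
        logger.warning("CTRL typically works better with lower temperatures.")
    return prompt_text


def prepare_padded_input(args, _model, _tokenizer, prompt_text):
    # Assumption: a user-supplied --prefix replaces the default padding text.
    prefix = args.prefix if args.prefix else PADDING_TEXT
    return prefix + prompt_text


PREPROCESSING_FUNCTIONS = {
    "ctrl": prepare_ctrl_input,
    "xlnet": prepare_padded_input,
    "transfo-xl": prepare_padded_input,
}


def preprocess(args, model, tokenizer, prompt_text):
    """Run the model-specific hook if one is registered, else pass through."""
    prepare = PREPROCESSING_FUNCTIONS.get(args.model_type)
    if prepare is None:
        return prompt_text
    return prepare(args, model, tokenizer, prompt_text)


args = SimpleNamespace(model_type="xlnet", prefix="", temperature=1.0)
padded_prompt = preprocess(args, None, None, "Hello")
```

Keeping the hooks in a dict means model types without special handling need no branch of their own; in this sketch they simply pass through unchanged.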

Generation Pipeline:

Parse arguments, set random seed, load model and tokenizer.
Apply model-specific preprocessing to the prompt.
Encode the prompt and call model.generate() with temperature, top-k, top-p, repetition penalty, and number of return sequences.
Decode generated tokens, optionally truncate at a stop token, and prepend the original prompt.
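The final step of the pipeline operates on plain strings, so it can be illustrated without loading a model. The sketch below assumes the decoded output begins with the decoded prompt; the function name postprocess and its parameter names are hypothetical.

```python
from typing import Optional


def postprocess(
    generated_text: str,
    decoded_prompt: str,
    prompt_text: str,
    stop_token: Optional[str] = None,
) -> str:
    """Truncate at the stop token, then swap the decoded prompt for the original."""
    if stop_token:
        cut = generated_text.find(stop_token)
        # str.find returns -1 when the token is absent; only cut on a real match.
        if cut != -1:
            generated_text = generated_text[:cut]
    # Everything after the decoded prompt is the model's continuation.
    continuation = generated_text[len(decoded_prompt):]
    return prompt_text + continuation


result = postprocess(
    "Hello world. More text", "Hello", "Hello", stop_token="."
)
```

Re-attaching the original prompt text, rather than the decoded one, keeps the user's exact input intact even if tokenization and decoding altered its whitespace.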

MAX_LENGTH is hardcoded at 10000 to prevent infinite generation loops. The adjust_length_to_model() helper clips the requested length to the model's max_position_embeddings.
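A minimal sketch of this clipping logic follows. It is a reconstruction from the description above rather than a verbatim copy, and it assumes that a negative length means "generate as much as allowed" and that a non-positive max_sequence_length means the model reports no fixed limit.

```python
MAX_LENGTH = 10000  # fallback cap so generation always terminates


def adjust_length_to_model(length: int, max_sequence_length: int) -> int:
    """Clip the requested length to what the model can handle."""
    if length < 0 and max_sequence_length > 0:
        # No explicit request: use the model's full window.
        return max_sequence_length
    if 0 < max_sequence_length < length:
        # Request exceeds the model's window: clip it.
        return max_sequence_length
    if length < 0:
        # No request and no model limit: fall back to the hard cap.
        return MAX_LENGTH
    return length


clipped = adjust_length_to_model(5000, 1024)
```

Under these assumptions, a request within the model's window is returned unchanged, and MAX_LENGTH only applies when neither the user nor the model supplies a bound.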

Usage

Use this script when:

Generating text samples from auto-regressive language models for experimentation.
Comparing generation quality across different model architectures (GPT-2, CTRL, XLNet, etc.).
Prototyping text generation pipelines with controllable sampling parameters.

Code Reference

Source Location

examples/NLU/examples/text-generation/run_generation.py (295 lines)

Signature

MAX_LENGTH = int(10000)

MODEL_CLASSES = {
    "gpt2": (GPT2LMHeadModel, GPT2Tokenizer),
    "ctrl": (CTRLLMHeadModel, CTRLTokenizer),
    "openai-gpt": (OpenAIGPTLMHeadModel, OpenAIGPTTokenizer),
    "xlnet": (XLNetLMHeadModel, XLNetTokenizer),
    "transfo-xl": (TransfoXLLMHeadModel, TransfoXLTokenizer),
    "xlm": (XLMWithLMHeadModel, XLMTokenizer),
}

PREPROCESSING_FUNCTIONS = {
    "ctrl": prepare_ctrl_input,
    "xlm": prepare_xlm_input,
    "xlnet": prepare_xlnet_input,
    "transfo-xl": prepare_transfoxl_input,
}

def set_seed(args) -> None: ...
def prepare_ctrl_input(args, _, tokenizer, prompt_text) -> str: ...
def prepare_xlm_input(args, model, tokenizer, prompt_text) -> str: ...
def prepare_xlnet_input(args, _, tokenizer, prompt_text) -> str: ...
def prepare_transfoxl_input(args, _, tokenizer, prompt_text) -> str: ...
def adjust_length_to_model(length: int, max_sequence_length: int) -> int: ...
def main() -> list: ...

Import / CLI Usage

# Paths are relative to examples/NLU, where the script lives under examples/text-generation/
python examples/text-generation/run_generation.py \
    --model_type gpt2 \
    --model_name_or_path gpt2 \
    --prompt "The future of AI is" \
    --length 100 \
    --temperature 0.7 \
    --k 50 \
    --p 0.95 \
    --num_return_sequences 3

I/O Contract

Inputs

Input	Type	Description
`--model_type`	str (required)	Model architecture: `gpt2`, `ctrl`, `openai-gpt`, `xlnet`, `transfo-xl`, `xlm`
`--model_name_or_path`	str (required)	Pretrained model name or local path
`--prompt`	str	Input prompt text; if empty, prompts interactively
`--length`	int	Number of tokens to generate. Default: 20
`--stop_token`	str	Token at which to stop generation
`--temperature`	float	Sampling temperature; 1.0 leaves the distribution unchanged, lower values approach greedy decoding. Default: 1.0
`--repetition_penalty`	float	Repetition penalty; mainly useful for CTRL, where 1.2 is the suggested value. Default: 1.0
`--k`	int	Top-k filtering. Default: 0 (disabled)
`--p`	float	Top-p (nucleus) filtering. Default: 0.9
`--seed`	int	Random seed. Default: 42
`--num_return_sequences`	int	Number of sequences to generate. Default: 1
`--fp16`	flag	Use half-precision inference
`--no_cuda`	flag	Disable CUDA
`--prefix`	str	Text prepended to the prompt
`--xlm_language`	str	Language code for XLM model

Outputs

Output	Type	Description
Generated sequences	stdout	Printed to console with `=== GENERATED SEQUENCE N ===` headers
Return value	list of str	List of generated text sequences (when called programmatically)

Usage Examples

# Generate text with GPT-2
python examples/text-generation/run_generation.py \
    --model_type gpt2 \
    --model_name_or_path gpt2-medium \
    --prompt "Once upon a time" \
    --length 200 \
    --temperature 0.8 \
    --k 40 \
    --p 0.95 \
    --seed 42

# Example output (illustrative):
# === GENERATED SEQUENCE 1 ===
# Once upon a time, in a land far away...

# Generate with CTRL using control code
python examples/text-generation/run_generation.py \
    --model_type ctrl \
    --model_name_or_path ctrl \
    --prompt "Links A new study shows" \
    --length 100 \
    --temperature 0.5 \
    --repetition_penalty 1.2

# Generate multiple sequences with XLNet
python examples/text-generation/run_generation.py \
    --model_type xlnet \
    --model_name_or_path xlnet-base-cased \
    --prompt "The meaning of life is" \
    --length 50 \
    --num_return_sequences 5

Related Pages

Environment:Microsoft_LoRA_NLU_Conda_Environment
