Implementation:Microsoft DeepSpeedExamples ZenFlow Finetune Llama

Knowledge Sources	Microsoft_DeepSpeedExamples
Domains	Deep Learning, Fine Tuning, Large Language Models
Last Updated	2026-02-07 12:00 GMT

Overview

A LLaMA fine-tuning script using DeepSpeed ZenFlow that preprocesses Alpaca instruction data and trains a causal language model with distributed training support.

Description

This module implements an end-to-end fine-tuning pipeline for LLaMA-family models using the DeepSpeed ZenFlow optimization framework. The main function orchestrates the full workflow: setting a reproducible seed, loading a tokenizer and model from HuggingFace with bfloat16 precision, tokenizing the tatsu-lab/alpaca instruction-following dataset, initializing DeepSpeed, and running the training loop.

The preprocess_alpaca function formats each Alpaca example into a structured prompt template with ### Instruction:, optional ### Input:, and ### Response: sections, then tokenizes with truncation and padding to a configurable max_length (default 512). Labels are set to a copy of the input IDs for causal language model training. The set_seed function ensures reproducibility across random, numpy, and torch (both CPU and CUDA) random number generators.
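
The preprocessing step described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the exact whitespace of the prompt template, the helper name build_alpaca_prompt, and the function name preprocess_alpaca_sketch are assumptions, and ToyTokenizer is a hypothetical offline stand-in that mimics only the keyword arguments a HuggingFace tokenizer accepts here.

```python
def build_alpaca_prompt(example):
    """Format one Alpaca record; the exact template whitespace is an assumption."""
    parts = [f"### Instruction:\n{example['instruction']}"]
    # The "### Input:" section is optional and only emitted when non-empty.
    if example.get("input", "").strip():
        parts.append(f"### Input:\n{example['input']}")
    parts.append(f"### Response:\n{example['output']}")
    return "\n\n".join(parts)


def preprocess_alpaca_sketch(example, tokenizer, max_length=512):
    """Tokenize the prompt to a fixed length and copy the input IDs into labels."""
    tokenized = tokenizer(
        build_alpaca_prompt(example),
        truncation=True,
        padding="max_length",
        max_length=max_length,
    )
    # Causal LM training: labels are a copy of the inputs.
    tokenized["labels"] = list(tokenized["input_ids"])
    return tokenized


class ToyTokenizer:
    """Offline stand-in (not a real tokenizer): one ID per whitespace-separated word."""

    def __call__(self, text, truncation=False, padding=False, max_length=None):
        ids = [len(word) for word in text.split()]  # word length as a fake token ID
        if truncation and max_length is not None:
            ids = ids[:max_length]
        mask = [1] * len(ids)
        if padding == "max_length" and max_length is not None:
            pad = max_length - len(ids)
            ids, mask = ids + [0] * pad, mask + [0] * pad
        return {"input_ids": ids, "attention_mask": mask}


example = {"instruction": "Add the numbers.", "input": "2 and 3", "output": "5"}
result = preprocess_alpaca_sketch(example, ToyTokenizer(), max_length=16)
print(len(result["input_ids"]), result["labels"] == result["input_ids"])
```

Because the labels are a plain copy of the input IDs, padded positions also contribute to the loss unless they are masked elsewhere; the description above does not mention any masking.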

The training loop uses deepspeed.initialize to wrap the model with a DeepSpeed engine, which automatically manages the optimizer, learning rate scheduler, gradient accumulation, and distributed communication. Each step logs loss and wall-clock time on rank 0, and the final checkpoint is saved using model_engine.save_checkpoint alongside the tokenizer.
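
A rough sketch of one training epoch under these assumptions: the engine comes from a call such as model_engine, optimizer, _, _ = deepspeed.initialize(model=model, model_parameters=model.parameters(), config=...), and each batch is a dictionary of tensors that includes labels. The function name train_epoch and the log format are hypothetical; the script's real loop may differ in detail.

```python
import time


def train_epoch(model_engine, dataloader, epoch=0):
    """Run one epoch with a DeepSpeed-style engine and return the per-step losses."""
    losses = []
    for step, batch in enumerate(dataloader):
        start = time.time()
        # Move every tensor in the batch to the device the engine runs on.
        batch = {name: tensor.to(model_engine.device) for name, tensor in batch.items()}
        loss = model_engine(**batch).loss  # forward pass; labels are part of the batch
        model_engine.backward(loss)        # the engine handles gradient accumulation
        model_engine.step()                # optimizer and scheduler update
        elapsed = time.time() - start
        losses.append(loss.item())
        if model_engine.global_rank == 0:  # log from rank 0 only
            print(f"epoch {epoch} step {step} loss {losses[-1]:.4f} time {elapsed:.2f}s")
    return losses
```

Calling backward and step on the engine rather than on the loss and optimizer directly is what lets DeepSpeed decide when gradients are reduced and when parameters are actually updated.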

Usage

Use this script to fine-tune LLaMA or similar causal language models on instruction-following data with DeepSpeed ZenFlow. Launch via the DeepSpeed distributed launcher with a JSON configuration file specifying ZeRO stage, batch size, learning rate, and other DeepSpeed settings.
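
As an illustration of the kind of configuration file meant here, the sketch below builds a minimal DeepSpeed config in Python and prints it as JSON. The helper name build_ds_config and all values are assumptions chosen for the example; ZenFlow-specific settings are left out because this page does not list them, so consult the configuration shipped with the example for those.

```python
import json


def build_ds_config(lr=2e-5, micro_batch_size=4, zero_stage=2, weight_decay=0.01):
    """Return a minimal DeepSpeed config dict; every value here is illustrative."""
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": 1,
        "bf16": {"enabled": True},  # matches the bfloat16 model weights
        "zero_optimization": {"stage": zero_stage},
        "optimizer": {
            "type": "AdamW",
            "params": {"lr": lr, "weight_decay": weight_decay},
        },
    }


# Print the config; redirect the output to a file such as ds_config.json to use it.
print(json.dumps(build_ds_config(), indent=2))
```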

Code Reference

Source Location

Repository: Microsoft_DeepSpeedExamples
File: training/DeepSpeed-ZenFlow/finetuning/finetune_llama.py
Lines: 1-112

Signature

def set_seed(seed) -> None: ...
def preprocess_alpaca(example, tokenizer, max_length=512) -> dict: ...
def main(args) -> None: ...

Import

from finetune_llama import main, preprocess_alpaca, set_seed

I/O Contract

Inputs

Name	Type	Required	Description
args.model_name	str	Yes	HuggingFace model identifier for the LLaMA model to fine-tune
args.lr	float	Yes	Learning rate for the optimizer
args.batch_size	int	Yes	Training batch size per device
args.weight_decay	float	No	Weight decay coefficient (default: 0.01)
args.warmup	float	No	Warmup proportion (default: 0.01)
args.num_train_epochs	int	No	Number of training epochs (default: 3)
args.output_dir	str	Yes	Directory for saving checkpoints and tokenizer
args.seed	int	No	Random seed for reproducibility (default: 42)
args.local_rank	int	No	Local rank for distributed training (default: -1)
example	dict	Yes (for preprocess_alpaca)	Alpaca dataset example with 'instruction', 'input', 'output' keys
tokenizer	AutoTokenizer	Yes (for preprocess_alpaca)	Tokenizer instance for encoding text
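
The args fields in the table above could be produced by an argparse parser along these lines. This is a sketch derived only from the table: the function name build_parser is hypothetical, flag names are assumed to match the attribute names, and the DeepSpeed configuration flag is omitted because the table does not describe it.

```python
import argparse


def build_parser():
    """Build a parser whose fields and defaults mirror the Inputs table."""
    parser = argparse.ArgumentParser(description="Fine-tune a causal LM (sketch)")
    parser.add_argument("--model_name", type=str, required=True)
    parser.add_argument("--lr", type=float, required=True)
    parser.add_argument("--batch_size", type=int, required=True)
    parser.add_argument("--weight_decay", type=float, default=0.01)
    parser.add_argument("--warmup", type=float, default=0.01)
    parser.add_argument("--num_train_epochs", type=int, default=3)
    parser.add_argument("--output_dir", type=str, required=True)
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--local_rank", type=int, default=-1)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--model_name", "meta-llama/Llama-2-7b-hf", "--lr", "2e-5",
     "--batch_size", "4", "--output_dir", "./output"]
)
print(args.num_train_epochs, args.seed, args.local_rank)
```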

Outputs

Name	Type	Description
tokenized	dict	Dictionary with 'input_ids', 'attention_mask', and 'labels' keys from preprocess_alpaca
checkpoint	directory	DeepSpeed checkpoint saved to args.output_dir on rank 0
tokenizer files	directory	Saved tokenizer files in args.output_dir on rank 0

Usage Examples

# Command-line launch with DeepSpeed
# deepspeed finetune_llama.py \
#     --model_name meta-llama/Llama-2-7b-hf \
#     --lr 2e-5 \
#     --batch_size 4 \
#     --num_train_epochs 3 \
#     --output_dir ./output \
#     --deepspeed ds_config.json

# Programmatic usage of preprocessing
from transformers import AutoTokenizer
from finetune_llama import preprocess_alpaca

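tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
# The Llama-2 tokenizer defines no pad token by default; padding to max_length needs one.
tokenizer.pad_token = tokenizer.eos_token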
example = {
    "instruction": "Summarize the following text.",
    "input": "DeepSpeed is a deep learning optimization library.",
    "output": "DeepSpeed optimizes deep learning training."
}
tokenized = preprocess_alpaca(example, tokenizer, max_length=512)
