Principle: Microsoft BIPIA Distributed Finetuning
| Field | Value |
|---|---|
| Sources | BIPIA paper, DeepSpeed ZeRO paper |
| Domains | NLP, Distributed_Training, Defense |
| Last Updated | 2026-02-14 |
Overview
A distributed supervised finetuning methodology that trains LLMs to ignore indirect prompt injection attacks using DeepSpeed ZeRO Stage 3 for memory-efficient multi-GPU training.
Description
The finetuning process takes the prepared model (extended with special boundary tokens that wrap external content) and the tokenized training dataset (with label masking applied to prompt tokens), then trains using HuggingFace's Trainer with DeepSpeed ZeRO Stage 3 optimization. ZeRO Stage 3 partitions model parameters, gradients, and optimizer states across GPUs, enabling finetuning of models larger than single-GPU memory. The trained model learns to attend to the boundary-marker tokens and ignore instructions injected inside the content they delimit. Training uses the AdamW optimizer with bf16 mixed precision.
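The label masking mentioned above can be sketched in plain Python. This is an illustrative reconstruction, not the BIPIA repository's actual helper: prompt tokens get label `-100` (the HuggingFace/PyTorch `ignore_index` convention), so cross-entropy loss is computed only on response tokens.

```python
# Hedged sketch of response-only label masking. Function and variable
# names are illustrative assumptions, not BIPIA's actual code.
IGNORE_INDEX = -100  # HuggingFace convention: these positions are skipped by the loss

def build_labels(prompt_ids, response_ids):
    """Return (input_ids, labels) with the prompt portion masked out."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

# Toy example: 3 prompt tokens followed by 2 response tokens.
inp, lab = build_labels([101, 7592, 102], [2023, 103])
# Only the last two positions carry real labels, so only the
# response contributes to the cross-entropy loss.
```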
Usage
Use when finetuning 7B-13B+ parameter LLMs for white-box defense. Requires multiple GPUs and DeepSpeed configuration.
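The required DeepSpeed configuration might look like the following, expressed as a Python dict (equivalent to the `ds_config.json` passed to HuggingFace Trainer via its `deepspeed` argument). The keys are standard DeepSpeed config fields; the specific values are illustrative assumptions, not the BIPIA defaults.

```python
# Minimal ZeRO Stage 3 config sketch; values are illustrative.
ds_config = {
    "zero_optimization": {
        "stage": 3,  # partition params, grads, and optimizer states across GPUs
        "overlap_comm": True,  # overlap parameter all-gather with compute
        "stage3_gather_16bit_weights_on_model_save": True,  # consolidate on save
    },
    "bf16": {"enabled": True},  # bf16 compute; optimizer states stay fp32
    "optimizer": {"type": "AdamW", "params": {"lr": 2e-5}},
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
}
```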
Theoretical Basis
Supervised finetuning minimizes cross-entropy loss on response tokens only (due to label masking). DeepSpeed ZeRO Stage 3 partitions memory as follows:
P_gpu = (params + grads + optimizer_states) / N_gpus
The DeepSpeed config uses bf16 for forward/backward computation while keeping fp32 master weights and Adam optimizer states. Gradient accumulation sums gradients over several micro-batches before each optimizer step, yielding a larger effective batch size without additional activation memory.
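The partitioning formula above can be made concrete with a back-of-the-envelope estimate. Assuming a common mixed-precision layout (2 B bf16 weights + 2 B bf16 gradients + 12 B fp32 Adam state per parameter, i.e. 16 B total; exact numbers vary by implementation):

```python
# Worked ZeRO Stage 3 memory estimate for model states only
# (activations, buffers, and fragmentation are excluded).
def zero3_gb_per_gpu(n_params, n_gpus, bytes_per_param=16):
    # 16 B/param = 2 (bf16 weights) + 2 (bf16 grads)
    #            + 12 (fp32 master copy, momentum, variance)
    return n_params * bytes_per_param / n_gpus / 1e9

per_gpu = zero3_gb_per_gpu(7e9, 8)  # 7B params on 8 GPUs -> 14.0 GB each

# Effective batch size under gradient accumulation:
micro_batch, accum_steps, n_gpus = 4, 8, 8
effective_batch = micro_batch * accum_steps * n_gpus  # 4 * 8 * 8 = 256
```

Without ZeRO, the full 112 GB of model states for a 7B model would have to fit on every GPU; Stage 3 divides that across the group, which is what makes multi-GPU finetuning of such models feasible.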