Principle: Microsoft BIPIA Distributed Finetuning
| Field | Value |
|---|---|
| Sources | BIPIA paper, DeepSpeed ZeRO paper |
| Domains | NLP, Distributed_Training, Defense |
| Last Updated | 2026-02-14 |
Overview
A distributed supervised finetuning methodology that trains LLMs to ignore indirect prompt injection attacks using DeepSpeed ZeRO Stage 3 for memory-efficient multi-GPU training.
Description
The finetuning process takes the prepared model (extended with special boundary tokens that wrap external content) and the tokenized training dataset (with label masking applied to prompt tokens), then trains using HuggingFace's Trainer with DeepSpeed ZeRO Stage 3 optimization. ZeRO Stage 3 partitions model parameters, gradients, and optimizer states across GPUs, enabling finetuning of models larger than single-GPU memory. The trained model learns to attend to the boundary-marker tokens and ignore instructions injected inside the content they delimit. Training uses the AdamW optimizer with bf16 mixed precision.
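The label masking mentioned above can be sketched in plain Python. This is an illustrative reconstruction, not the BIPIA repository's actual helper: prompt tokens get label `-100` (the HuggingFace/PyTorch `ignore_index` convention), so cross-entropy loss is computed only on response tokens.

```python
# Hedged sketch of response-only label masking. Function and variable
# names are illustrative assumptions, not BIPIA's actual code.
IGNORE_INDEX = -100  # HuggingFace convention: these positions are skipped by the loss

def build_labels(prompt_ids, response_ids):
    """Return (input_ids, labels) with the prompt portion masked out."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

# Toy example: 3 prompt tokens followed by 2 response tokens.
inp, lab = build_labels([101, 7592, 102], [2023, 103])
# Only the last two positions carry real labels, so only the
# response contributes to the cross-entropy loss.
```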
Usage
Use when finetuning 7B-13B+ parameter LLMs for white-box defense. Requires multiple GPUs and DeepSpeed configuration.
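The required DeepSpeed configuration might look like the following, expressed as a Python dict (equivalent to the `ds_config.json` passed to HuggingFace Trainer via its `deepspeed` argument). The keys are standard DeepSpeed config fields; the specific values are illustrative assumptions, not the BIPIA defaults.

```python
# Minimal ZeRO Stage 3 config sketch; values are illustrative.
ds_config = {
    "zero_optimization": {
        "stage": 3,  # partition params, grads, and optimizer states across GPUs
        "overlap_comm": True,  # overlap parameter all-gather with compute
        "stage3_gather_16bit_weights_on_model_save": True,  # consolidate on save
    },
    "bf16": {"enabled": True},  # bf16 compute; optimizer states stay fp32
    "optimizer": {"type": "AdamW", "params": {"lr": 2e-5}},
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
}
```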
Theoretical Basis
Supervised finetuning minimizes cross-entropy loss on response tokens only (due to label masking). DeepSpeed ZeRO Stage 3 partitions memory as follows:
P_gpu = (params + grads + optimizer_states) / N_gpus
The DeepSpeed config uses bf16 for forward/backward computation while keeping fp32 master weights and Adam optimizer states. Gradient accumulation sums gradients over several micro-batches before each optimizer step, yielding a larger effective batch size without additional activation memory.
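The partitioning formula above can be made concrete with a back-of-the-envelope estimate. Assuming a common mixed-precision layout (2 B bf16 weights + 2 B bf16 gradients + 12 B fp32 Adam state per parameter, i.e. 16 B total; exact numbers vary by implementation):

```python
# Worked ZeRO Stage 3 memory estimate for model states only
# (activations, buffers, and fragmentation are excluded).
def zero3_gb_per_gpu(n_params, n_gpus, bytes_per_param=16):
    # 16 B/param = 2 (bf16 weights) + 2 (bf16 grads)
    #            + 12 (fp32 master copy, momentum, variance)
    return n_params * bytes_per_param / n_gpus / 1e9

per_gpu = zero3_gb_per_gpu(7e9, 8)  # 7B params on 8 GPUs -> 14.0 GB each

# Effective batch size under gradient accumulation:
micro_batch, accum_steps, n_gpus = 4, 8, 8
effective_batch = micro_batch * accum_steps * n_gpus  # 4 * 8 * 8 = 256
```

Without ZeRO, the full 112 GB of model states for a 7B model would have to fit on every GPU; Stage 3 divides that across the group, which is what makes multi-GPU finetuning of such models feasible.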