Implementation: Hugging Face Diffusers Accelerator Setup
| Knowledge Sources | |
|---|---|
| Domains | Diffusion_Models, Distributed_Training, Mixed_Precision |
| Last Updated | 2026-02-13 21:00 GMT |
Overview
A concrete pattern, provided by Hugging Face Accelerate, for initializing a distributed training environment, as used in the Diffusers LoRA training examples.
Description
The Accelerator class from the accelerate library is the entry point for all distributed training functionality in Diffusers training scripts. It wraps PyTorch's distributed primitives (DDP, FSDP) and provides a unified interface for mixed precision, gradient accumulation, logging, and device placement. In the LoRA fine-tuning script, the Accelerator is initialized with a ProjectConfiguration that specifies output and logging directories, and then used throughout training to coordinate multi-process behavior.
The setup pattern also includes configuring logging verbosity per process (only the main process logs at INFO level), setting the random seed for reproducibility, creating output directories, and optionally creating a Hugging Face Hub repository for pushing results.
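A minimal sketch of that surrounding setup, assuming the accelerator has already been created (see the usage examples below) and using illustrative variable names rather than the script's actual argument parsing:
import os
from accelerate.utils import set_seed
from huggingface_hub import create_repo

seed = 42
output_dir = "./output"
push_to_hub = False  # set True to mirror results to the Hugging Face Hub (requires a logged-in token)

# Seed every process identically for reproducible training
if seed is not None:
    set_seed(seed)

# Only the main process touches the filesystem or the Hub
if accelerator.is_main_process:
    os.makedirs(output_dir, exist_ok=True)
    if push_to_hub:
        # create_repo is idempotent with exist_ok=True and returns the resolved repo id
        repo_id = create_repo(repo_id=os.path.basename(output_dir), exist_ok=True).repo_id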
Usage
Use this pattern at the beginning of any Diffusers training script (a training-loop sketch follows the list) to:
- Initialize distributed training across available GPUs
- Configure mixed precision (fp16, bf16, or fp32)
- Set up gradient accumulation
- Configure experiment tracking (TensorBoard, W&B, etc.)
- Ensure reproducible training with seed setting
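A minimal sketch of how the configured accelerator is then used in a training loop; the model, optimizer, and dataloader here are stand-ins for the real UNet, AdamW optimizer, and dataset of the LoRA script:
import torch

# Placeholder model, optimizer, and data; the real script builds these from the pipeline and dataset
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
train_dataloader = torch.utils.data.DataLoader(torch.randn(16, 4), batch_size=4)

# prepare() wraps everything for distributed execution and moves it to the right device
model, optimizer, train_dataloader = accelerator.prepare(model, optimizer, train_dataloader)

for batch in train_dataloader:
    # accumulate() tracks micro-batches and only syncs gradients on the final one
    with accelerator.accumulate(model):
        loss = model(batch).pow(2).mean()   # stand-in for the real diffusion loss
        accelerator.backward(loss)          # handles loss scaling under fp16
        optimizer.step()
        optimizer.zero_grad()

    # Send metrics to the tracker(s) configured via log_with
    accelerator.log({"train_loss": loss.detach().item()})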
Code Reference
Source Location
- Repository: diffusers
- File: examples/text_to_image/train_text_to_image_lora.py
- Lines: 467-505
Signature
accelerator = Accelerator(
gradient_accumulation_steps=args.gradient_accumulation_steps,
mixed_precision=args.mixed_precision,
log_with=args.report_to,
project_config=accelerator_project_config,
)
Import
from accelerate import Accelerator
from accelerate.utils import ProjectConfiguration, set_seed
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| gradient_accumulation_steps | int | No | Number of micro-batches to accumulate before performing an optimizer step. Defaults to 1 (no accumulation). |
| mixed_precision | str | No | Precision mode: "no", "fp16", or "bf16". Controls whether the forward pass uses half precision. |
| log_with | str or list[str] | No | Experiment tracker(s) to use. Supports "tensorboard", "wandb", "comet_ml", or a list of these. |
| project_config | ProjectConfiguration | No | Configuration object specifying project_dir (output directory) and logging_dir. |
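In the Diffusers training scripts, the mixed_precision value is also typically used to choose a dtype for the frozen (non-trained) model components such as the VAE and text encoder, while the trainable LoRA parameters stay in fp32. A minimal sketch of that mapping, assuming the accelerator from above:
import torch

# Cast frozen weights to the inference dtype implied by the mixed-precision setting
weight_dtype = torch.float32
if accelerator.mixed_precision == "fp16":
    weight_dtype = torch.float16
elif accelerator.mixed_precision == "bf16":
    weight_dtype = torch.bfloat16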
Outputs
| Name | Type | Description |
|---|---|---|
| accelerator | Accelerator | Configured accelerator instance that manages distributed training state, device placement, mixed precision, and logging. |
Usage Examples
Basic Usage
from accelerate import Accelerator
from accelerate.utils import ProjectConfiguration, set_seed
from pathlib import Path
# Configure project directories
output_dir = "./output"
logging_dir = Path(output_dir, "logs")
project_config = ProjectConfiguration(
project_dir=output_dir,
logging_dir=logging_dir,
)
# Initialize the accelerator
accelerator = Accelerator(
gradient_accumulation_steps=4,
mixed_precision="fp16",
log_with="tensorboard",
project_config=project_config,
)
# Set seed for reproducibility
set_seed(42)
# Use accelerator properties throughout training
print(f"Device: {accelerator.device}")
print(f"Num processes: {accelerator.num_processes}")
print(f"Is main process: {accelerator.is_main_process}")
Logging Configuration Pattern
import logging
import diffusers
import transformers
import datasets
logging.basicConfig(
format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
datefmt="%m/%d/%Y %H:%M:%S",
level=logging.INFO,
)
# The main process keeps library logging verbose (warning/info); all other processes log only errors
if accelerator.is_local_main_process:
datasets.utils.logging.set_verbosity_warning()
transformers.utils.logging.set_verbosity_warning()
diffusers.utils.logging.set_verbosity_info()
else:
datasets.utils.logging.set_verbosity_error()
transformers.utils.logging.set_verbosity_error()
diffusers.utils.logging.set_verbosity_error()
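After verbosity is configured, the experiment trackers requested via log_with still need to be initialized before any metrics are logged, and flushed at the end of training. A minimal sketch, using the run name from the LoRA example (the config dict shown here is illustrative; the script passes its parsed arguments):
# Initialize the tracker(s) passed via log_with; only the main process talks to them
if accelerator.is_main_process:
    accelerator.init_trackers("text2image-fine-tune", config={"seed": 42, "learning_rate": 1e-4})

# ... training loop ...

# Flush and close the trackers once training is done
accelerator.end_training()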