Environment: AUTOMATIC1111 Stable Diffusion WebUI Xformers Attention
| Knowledge Sources | Details |
|---|---|
| Domains | Infrastructure, Optimization |
| Last Updated | 2026-02-08 08:00 GMT |
Overview
Optional xformers memory-efficient attention library providing the highest-priority cross-attention optimization for NVIDIA GPUs with CUDA compute capability 6.0-9.0.
Description
Xformers is an optional dependency that provides memory-efficient attention kernels. When available and enabled, it replaces the default cross-attention forward methods in both the LDM and SGM model codebases via monkey-patching. On CUDA systems it has the highest optimization priority (100) among all attention optimizers, so it is chosen automatically when available. The library requires NVIDIA CUDA GPUs with compute capability between 6.0 and 9.0 (inclusive), covering Pascal (6.x) through Hopper (9.0), which includes Ada Lovelace (8.9).
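A minimal sketch of how this priority selection plays out (a hypothetical simplification for illustration; the real selection logic lives in `modules/sd_hijack.py` and consults each optimizer's `is_available()`):
```python
from dataclasses import dataclass

@dataclass
class Optimizer:
    name: str
    priority: int
    available: bool  # stands in for the real is_available() check

def pick_optimizer(optimizers):
    """Return the highest-priority optimizer that reports itself available."""
    candidates = [o for o in optimizers if o.available]
    return max(candidates, key=lambda o: o.priority, default=None)

# Priorities as documented in Compatibility Notes below.
optimizers = [
    Optimizer("xformers", 100, True),
    Optimizer("Doggettx", 90, True),
    Optimizer("sdp-no-mem", 80, True),
    Optimizer("sdp", 70, True),
    Optimizer("sub-quadratic", 10, True),
]
print(pick_optimizer(optimizers).name)  # -> "xformers"
```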
Usage
Enable xformers for faster and more memory-efficient image generation on compatible NVIDIA GPUs. It is particularly beneficial for high-resolution generation and when VRAM is limited. Use the `--xformers` command-line flag to enable it, or `--force-enable-xformers` to bypass compatibility checks.
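Most installs enable the flag persistently through `COMMANDLINE_ARGS` in `webui-user.sh` (or `webui-user.bat` on Windows), for example:
```bash
# webui-user.sh (Linux/macOS-style shells)
# On Windows, webui-user.bat uses: set COMMANDLINE_ARGS=--xformers
export COMMANDLINE_ARGS="--xformers"
```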
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| GPU | NVIDIA with CUDA compute capability 6.0-9.0 | Pascal (GTX 10xx, 6.x) through Hopper (H100, 9.0); Ada Lovelace (RTX 40xx) is 8.9, within range |
| CUDA | Compatible with PyTorch CUDA version | Must match the PyTorch CUDA build |
| OS | Linux or Windows | Not available on macOS (MPS) |
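Eligibility can be checked ahead of time with the same PyTorch query the webui uses (a standalone sketch, not part of the webui):
```python
import torch

# Mirrors the is_available() range check shown under Code Evidence below.
if torch.cuda.is_available():
    cap = torch.cuda.get_device_capability()  # e.g. (8, 6) for an RTX 3090
    print(f"compute capability: {cap[0]}.{cap[1]}")
    print("within xformers range:", (6, 0) <= cap <= (9, 0))
else:
    print("no CUDA device detected")
```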
Dependencies
Python Packages
- `xformers` == 0.0.23.post1
Environment Variables
- `XFORMERS_PACKAGE`: Override xformers package version (default: xformers==0.0.23.post1)
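To pin a different build, export the variable before launching; the version string below is hypothetical, for illustration only:
```bash
# Hypothetical pin; substitute whatever build matches your PyTorch CUDA version
export XFORMERS_PACKAGE="xformers==0.0.24"
python launch.py --xformers --reinstall-xformers
```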
Quick Install
```bash
# Install xformers (normally handled by launch.py with the --xformers flag)
pip install xformers==0.0.23.post1

# Or launch with auto-install
python launch.py --xformers
```
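Verify the installed build afterwards:
```bash
python -c "import xformers; print(xformers.__version__)"
```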
Code Evidence
Xformers import and availability check from `modules/sd_hijack_optimizations.py:158-163`:
```python
if shared.cmd_opts.xformers or shared.cmd_opts.force_enable_xformers:
    try:
        import xformers.ops
        shared.xformers_available = True
    except Exception:
        errors.report("Cannot import xformers", exc_info=True)
```
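Once the import succeeds, the patched forward methods route attention through `xformers.ops.memory_efficient_attention`. A minimal standalone sketch of that core call (illustrative tensor shapes, not the webui's actual forward method; requires a CUDA device with xformers installed):
```python
import torch
import xformers.ops

# xformers shape convention: (batch, seq_len, heads, head_dim).
# 4096 image tokens cross-attending to 77 text tokens.
q = torch.randn(1, 4096, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 77, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 77, 8, 64, device="cuda", dtype=torch.float16)

# Computes softmax(q @ k^T / sqrt(d)) @ v without materializing the full
# attention matrix, which is where the VRAM savings come from.
out = xformers.ops.memory_efficient_attention(q, k, v)
print(out.shape)  # torch.Size([1, 4096, 8, 64])
```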
GPU capability check from `modules/sd_hijack_optimizations.py:56-57`:
```python
def is_available(self):
    return shared.cmd_opts.force_enable_xformers or (
        shared.xformers_available and torch.cuda.is_available()
        and (6, 0) <= torch.cuda.get_device_capability(shared.device) <= (9, 0)
    )
```
Xformers version check from `modules/errors.py:125-135`:
```python
if shared.xformers_available:
    import xformers

    if version.parse(xformers.__version__) < version.parse(expected_xformers_version):
        print_error_explanation(f"""
You are running xformers {xformers.__version__}.
The program is tested to work with xformers {expected_xformers_version}.
To reinstall the desired version, run with commandline flag --reinstall-xformers.
""")
```
Auto-install logic from `modules/launch_utils.py:401-403`:
```python
if (not is_installed("xformers") or args.reinstall_xformers) and args.xformers:
    run_pip(f"install -U -I --no-deps {xformers_package}", "xformers")
```
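With the default pin, the `run_pip` call expands to roughly the following (`run_pip` appends a few extra pip options such as `--prefer-binary`):
```bash
python -m pip install -U -I --no-deps xformers==0.0.23.post1
```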
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `Cannot import xformers` | xformers not installed or incompatible | Use `--reinstall-xformers` flag; verify CUDA version compatibility |
| `You are running xformers X.Y.Z` | Version mismatch | Use `--reinstall-xformers` to install correct version |
| xformers not selected despite being installed | GPU compute capability outside 6.0-9.0 range | Use `--force-enable-xformers` to override check |
Compatibility Notes
- Priority system: xformers has priority 100, the highest on CUDA. Fallback order by priority: Doggettx split attention (90), SDP-no-mem (80), SDP (70), sub-quadratic (10 on CUDA, 1000 on MPS), then InvokeAI and V1 split.
- SDP alternative: PyTorch 2.0+ includes `scaled_dot_product_attention`, which provides similar benefits without xformers. Use `--opt-sdp-attention` or `--opt-sdp-no-mem-attention` (see the sketch after this list).
- Flash Attention: xformers Flash Attention is available via the `--xformers-flash-attention` flag for improved reproducibility (SD2.x models only).
- Not available on MPS: macOS uses sub-quadratic attention as the default high-priority optimizer instead.
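For comparison with the SDP alternative above, a minimal sketch of the PyTorch 2.0+ built-in (illustrative shapes only):
```python
import torch
import torch.nn.functional as F

# Same cross-attention workload as the xformers sketch above, but note the
# layout differs: (batch, heads, seq_len, head_dim) instead of
# (batch, seq_len, heads, head_dim).
q = torch.randn(1, 8, 4096, 64)
k = torch.randn(1, 8, 77, 64)
v = torch.randn(1, 8, 77, 64)

out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 4096, 64])
```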