
Environment:Facebookresearch Audiocraft XFormers Memory Efficient Attention

From Leeroopedia
Knowledge Sources
Domains Infrastructure, Optimization, Deep_Learning
Last Updated 2026-02-13 23:00 GMT

Overview

AudioCraft's transformer modules take an optional dependency on xformers (< 0.0.23) for memory-efficient attention and gradient checkpointing.

Description

AudioCraft supports two attention backends: PyTorch native (torch.nn.functional.scaled_dot_product_attention) and xformers (xformers.ops.memory_efficient_attention). The backend is selected globally via set_efficient_attention_backend(). While PyTorch native attention is the default, xformers is required for:

  • MAGNeT models: The MAGNeT loader explicitly sets the attention backend to xformers when memory_efficient=True.
  • Gradient checkpointing: The xformers_default and xformers_mm checkpointing strategies require the xformers.checkpoint_fairinternal module.
  • Custom attention masks: xformers supports efficient custom attention masks via LowerTriangularMask.
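The global backend-selection pattern described above can be sketched in plain Python. This is a minimal, self-contained reimplementation for illustration, not AudioCraft's actual module; the getter function is an assumption added here for symmetry:

```python
# Minimal sketch of a module-level attention-backend switch, modeled on
# the pattern in audiocraft.modules.transformer. Illustrative only.

_efficient_attention_backend = 'torch'  # PyTorch native is the default


def set_efficient_attention_backend(backend: str = 'torch') -> None:
    """Select the attention backend globally for all transformer layers."""
    global _efficient_attention_backend
    # Validate the requested backend before committing it.
    assert backend in ('xformers', 'torch'), f"unknown backend: {backend}"
    _efficient_attention_backend = backend


def get_efficient_attention_backend() -> str:
    """Return the currently selected backend (hypothetical helper)."""
    return _efficient_attention_backend


set_efficient_attention_backend('xformers')
print(get_efficient_attention_backend())  # -> xformers
```

Because the flag is module-level state, the selection must happen before any attention layer runs; this is why the MAGNeT loader (shown under Code Evidence) sets it at model-load time.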

The xformers package requires CUDA compilation with matching architecture flags (e.g., TORCH_CUDA_ARCH_LIST='8.0' for Ampere GPUs).

Usage

Use this environment when running MAGNeT models with memory-efficient attention, or when enabling gradient checkpointing with xformers strategies in the StreamingTransformer. Also required when training large transformer models where PyTorch native attention runs out of memory.

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| Hardware | NVIDIA GPU with CUDA | Required for xformers compilation |
| CUDA Architecture | Compute capability 6.0+ | Set via TORCH_CUDA_ARCH_LIST env var during build |
| PyTorch | 2.1.0 | Must match xformers version |
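The TORCH_CUDA_ARCH_LIST variable in the table is a semicolon- or space-separated list of compute capabilities such as '6.0;7.0;8.0+PTX'. A small sketch of validating such a string against the 6.0+ floor above (helper names are hypothetical, not part of AudioCraft or PyTorch):

```python
def parse_cuda_arch_list(value: str) -> list[tuple[int, int]]:
    """Parse a TORCH_CUDA_ARCH_LIST-style string (e.g. '6.0;7.0;8.0+PTX')
    into (major, minor) compute capabilities. Illustrative helper only."""
    caps = []
    for entry in value.replace(' ', ';').split(';'):
        if not entry:
            continue
        # '+PTX' asks the build to also embed forward-compatible PTX.
        major, minor = entry.removesuffix('+PTX').split('.')
        caps.append((int(major), int(minor)))
    return caps


def meets_minimum(value: str, minimum: tuple[int, int] = (6, 0)) -> bool:
    """True if every requested arch satisfies the 6.0+ floor in the table."""
    return all(cap >= minimum for cap in parse_cuda_arch_list(value))


print(meets_minimum('8.0'))      # True: Ampere (A100) is above the floor
print(meets_minimum('5.2;8.0'))  # False: Maxwell 5.2 is below 6.0
```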

Dependencies

Python Packages

  • xformers < 0.0.23 (from requirements.txt)
  • CI pins xformers == 0.0.22.post7

Credentials

No credentials required.

Quick Install

# Standard install (pre-built wheel)
pip install xformers==0.0.22.post7

# Build from source (if pre-built wheel unavailable)
FORCE_CUDA=1 TORCH_CUDA_ARCH_LIST='8.0' \
  pip install -U git+https://github.com/facebookresearch/xformers.git#egg=xformers
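The `< 0.0.23` pin can be sanity-checked with a crude version comparison. This sketch handles the `.postN` suffix used by the CI pin; for real code, prefer `packaging.version.Version`. The helper names are illustrative:

```python
def version_key(v: str) -> list[tuple[int, int]]:
    """Crude PEP 440-ish sort key: numeric segments compare numerically,
    'postN' segments sort after the bare release. Sketch only; use
    packaging.version.Version in production."""
    key = []
    for part in v.split('.'):
        if part.isdigit():
            key.append((0, int(part)))
        elif part.startswith('post') and part[4:].isdigit():
            key.append((1, int(part[4:])))
        else:
            key.append((2, 0))  # unknown segment sorts last
    return key


def satisfies_pin(installed: str, upper: str = '0.0.23') -> bool:
    """True if `installed` respects the `< 0.0.23` requirement."""
    return version_key(installed) < version_key(upper)


print(satisfies_pin('0.0.22.post7'))  # True: the CI pin is within the bound
print(satisfies_pin('0.0.23'))        # False: excluded by the strict bound
```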

Code Evidence

Attention backend selection from audiocraft/modules/transformer.py:31-35:

def set_efficient_attention_backend(backend: str = 'torch'):
    global _efficient_attention_backend
    assert _efficient_attention_backend in ['xformers', 'torch']
    _efficient_attention_backend = backend

xformers import verification from audiocraft/modules/transformer.py:727-737:

def _verify_xformers_memory_efficient_compat():
    try:
        from xformers.ops import memory_efficient_attention, LowerTriangularMask
    except ImportError:
        raise ImportError(
            "xformers is not installed. Please install it and try again.\n"
            "To install on AWS and Azure, run \n"
            "FORCE_CUDA=1 TORCH_CUDA_ARCH_LIST='8.0'\\\n"
            "pip install -U git+https://git@github.com/fairinternal/xformers.git#egg=xformers\n"
        )

xformers gradient checkpointing verification from audiocraft/modules/transformer.py:741-751:

def _verify_xformers_internal_compat():
    try:
        from xformers.checkpoint_fairinternal import checkpoint, _get_default_policy
    except ImportError:
        raise ImportError(
            "Francisco's fairinternal xformers is not installed..."
            "FORCE_CUDA=1 TORCH_CUDA_ARCH_LIST='6.0;7.0'\\\n"
        )

Backend-specific attention dispatch from audiocraft/modules/transformer.py:402-416:

if self.memory_efficient:
    p = self.dropout if self.training else 0
    if _efficient_attention_backend == 'torch':
        x = torch.nn.functional.scaled_dot_product_attention(
            q, k, v, is_causal=attn_mask is not None, dropout_p=p)
    else:
        x = ops.memory_efficient_attention(q, k, v, attn_mask, p=p)

MAGNeT forcing xformers backend from audiocraft/models/loaders.py:148-149:

if cfg.transformer_lm.memory_efficient:
    set_efficient_attention_backend("xformers")

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| ImportError: xformers is not installed | xformers package missing | pip install xformers==0.0.22.post7 |
| ImportError: Francisco's fairinternal xformers is not installed | Using xformers checkpointing without internal build | Build xformers from source with FORCE_CUDA=1 |
| CUDA error: no kernel image is available | xformers compiled for wrong GPU architecture | Rebuild with correct TORCH_CUDA_ARCH_LIST (e.g., '8.0' for A100) |
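The table above can be encoded as a simple substring-based triage helper. This is a hypothetical convenience, not part of AudioCraft; the keys and remedies come directly from the table:

```python
# Hypothetical triage helper encoding the Common Errors table.
# The more specific 'fairinternal' key must come first, because the generic
# "xformers is not installed" substring also occurs in that message.
REMEDIES = {
    "fairinternal xformers is not installed":
        "build xformers from source with FORCE_CUDA=1",
    "xformers is not installed":
        "pip install xformers==0.0.22.post7",
    "no kernel image is available":
        "rebuild with the correct TORCH_CUDA_ARCH_LIST (e.g. '8.0' for A100)",
}


def suggest_fix(error_message: str) -> str:
    """Return the first matching remedy, relying on dict insertion order."""
    for needle, remedy in REMEDIES.items():
        if needle in error_message:
            return remedy
    return "no known remedy; check that xformers and PyTorch versions match"


print(suggest_fix("ImportError: xformers is not installed. Please install it"))
# -> pip install xformers==0.0.22.post7
```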

Compatibility Notes

  • PyTorch native attention (default): Works without xformers; uses torch.nn.functional.scaled_dot_product_attention.
  • MAGNeT models: Explicitly require xformers backend when memory_efficient=True is set in config.
  • Gradient checkpointing: torch strategy uses PyTorch native; xformers_default and xformers_mm require xformers internal module.
  • Tensor layout: xformers uses time dimension at index 1; PyTorch native uses index 2 when memory_efficient=True.
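The tensor-layout note can be made concrete with pure shape arithmetic: with memory_efficient=True, xformers consumes [B, T, H, D] (time at index 1), while the PyTorch native path consumes [B, H, T, D] (time at index 2). A sketch, with hypothetical helper names:

```python
def time_dim(backend: str) -> int:
    """Index of the time axis for each backend, per the note above."""
    assert backend in ('xformers', 'torch')
    return 1 if backend == 'xformers' else 2


def to_backend_layout(shape_bthd: tuple, backend: str) -> tuple:
    """Map an xformers-style [B, T, H, D] shape to the layout the chosen
    backend consumes. Pure shape arithmetic; illustrative only."""
    b, t, h, d = shape_bthd
    if backend == 'xformers':
        return (b, t, h, d)  # time stays at index 1
    return (b, h, t, d)      # torch SDPA path: time moves to index 2


print(to_backend_layout((2, 100, 8, 64), 'torch'))     # (2, 8, 100, 64)
print(to_backend_layout((2, 100, 8, 64), 'xformers'))  # (2, 100, 8, 64)
```

Getting this permutation wrong produces shape-mismatch errors rather than silent corruption, but it is the main pitfall when switching backends mid-project.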
