
Environment: Axolotl Python Runtime (axolotl-ai-cloud)

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Deep_Learning
Last Updated: 2026-02-06 22:33 GMT

Overview

Python 3.10+ runtime environment with PyTorch >= 2.4, HuggingFace Transformers 5.0, Accelerate 1.12, and PEFT >= 0.18.1 for LLM fine-tuning.

Description

This environment defines the base Python software stack required to run any Axolotl training workflow. It is built around PyTorch as the deep learning framework, with HuggingFace Transformers providing the model architecture layer, Accelerate handling distributed training orchestration, and PEFT enabling parameter-efficient fine-tuning methods like LoRA and QLoRA. The runtime automatically detects the installed PyTorch version and pins compatible dependency versions for xformers, vLLM, and fbgemm-gpu.
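The version-detection step described above can be sketched as follows. This is an illustrative reading of the behavior, not Axolotl's actual internals; the function names and the fallback default are assumptions.

```python
from importlib.metadata import PackageNotFoundError, version


def torch_major_minor(raw: str) -> tuple:
    """Parse 'MAJOR.MINOR.PATCH[+local]' into (major, minor) ints."""
    major, minor = raw.split(".")[:2]
    return int(major), int(minor.split("+")[0])


def installed_torch_version(default: str = "2.8.0") -> tuple:
    """(major, minor) of the installed torch, else the documented default."""
    try:
        return torch_major_minor(version("torch"))
    except PackageNotFoundError:
        return torch_major_minor(default)


def validate_torch(ver: tuple) -> None:
    """Mirror setup.py's floor check: axolotl requires torch>=2.4."""
    if ver < (2, 4):
        raise ValueError("axolotl requires torch>=2.4")
```

Pinning against the detected version, rather than a hard-coded one, is what lets xformers, vLLM, and fbgemm-gpu stay compatible across torch releases.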

Usage

Use this environment for all Axolotl workflows. Every training configuration, dataset preparation, and model loading operation depends on this base runtime. It is the mandatory prerequisite for all Implementation pages in this wiki.

System Requirements

  • OS: Linux (Ubuntu recommended). macOS is supported with reduced functionality (no bitsandbytes, triton, or xformers).
  • Python: >= 3.10 (defined in pyproject.toml).
  • Architecture: x86_64 preferred. aarch64 excludes xformers; Darwin excludes CUDA-only packages.
  • Disk: 10GB+ SSD, for package installation and dataset caching.

Dependencies

System Packages

  • `python` >= 3.10
  • `git`
  • `git-lfs` (for HuggingFace model downloads)

Python Packages (Core)

  • `torch` >= 2.4.0 (default 2.8.0; dynamically pinned to installed version)
  • `transformers` == 5.0.0
  • `accelerate` == 1.12.0
  • `peft` >= 0.18.1
  • `datasets` == 4.5.0
  • `trl` == 0.27.1
  • `tokenizers` >= 0.22.1
  • `huggingface_hub` >= 1.1.7
  • `pydantic` >= 2.10.6
  • `PyYAML` >= 6.0
  • `packaging` == 26.0
  • `numpy` >= 2.2.6
  • `numba` >= 0.61.2
  • `scipy`
  • `einops`
  • `wandb`
  • `tensorboard`
  • `sentencepiece`

Python Packages (Evaluation)

  • `evaluate` == 0.4.1
  • `lm_eval` == 0.4.7

Python Packages (Remote Filesystems)

  • `s3fs` >= 2024.5.0 (for S3 paths)
  • `gcsfs` >= 2025.3.0 (for GCS paths)
  • `adlfs` >= 2024.5.0 (for Azure paths)
  • `ocifs` == 1.3.2 (for OCI paths)
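The package-per-protocol split above can be expressed as a small lookup. This is an illustrative helper, not Axolotl's code, and the exact scheme spellings (`gs`, `abfs`, `az`, `oci`) are assumptions based on common fsspec usage.

```python
from typing import Optional

# Scheme-to-package mapping; scheme spellings are assumed, not taken
# from Axolotl's own sources.
FS_PACKAGES = {
    "s3": "s3fs",
    "gs": "gcsfs",
    "gcs": "gcsfs",
    "abfs": "adlfs",
    "az": "adlfs",
    "oci": "ocifs",
}


def required_fs_package(path: str) -> Optional[str]:
    """Return the filesystem package needed to read `path` (None if local)."""
    scheme, sep, _ = path.partition("://")
    return FS_PACKAGES.get(scheme) if sep else None
```

For example, a dataset path like `s3://bucket/train.jsonl` would need `s3fs` installed before training starts.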

Python Packages (Platform-Specific, Linux Only)

  • `bitsandbytes` == 0.49.1 (skipped on macOS)
  • `triton` >= 3.0.0 (skipped on macOS)
  • `xformers` >= 0.0.23.post1 (skipped on macOS and aarch64; version pinned per torch version)
  • `liger-kernel` == 0.6.4 (skipped on macOS)
  • `mamba-ssm` == 1.2.0.post1 (skipped on macOS; optional extra)

Optional Extras (`pip install "axolotl[extra]"`)

  • `flash-attn`: flash-attn == 2.8.3
  • `ring-flash-attn`: flash-attn == 2.8.3 + ring-flash-attn >= 0.1.7
  • `deepspeed`: deepspeed == 0.18.2 + deepspeed-kernels
  • `mamba-ssm`: mamba-ssm == 1.2.0.post1 + causal_conv1d
  • `auto-gptq`: auto-gptq == 0.5.1
  • `optimizers`: galore_torch, apollo-torch, lomo-optim, torch-optimi, came_pytorch
  • `vllm`: version depends on torch (0.10.0 to 0.14.0)
  • `llmcompressor`: llmcompressor == 0.5.1
  • `ray`: ray[train] >= 2.52.1
  • `opentelemetry`: opentelemetry-api, opentelemetry-sdk, opentelemetry-exporter-prometheus

Credentials

The following environment variables may be needed at runtime:

  • `HF_TOKEN`: HuggingFace API token for gated model access (e.g., Llama, Mistral).
  • `HF_HUB_OFFLINE`: Set to "1" to run in offline mode without network access.
  • `WANDB_API_KEY`: Weights & Biases API key for experiment logging.
  • `TOKENIZERS_PARALLELISM`: Automatically set to "false" by the Axolotl CLI to suppress tokenizer fork warnings.
  • `AXOLOTL_DO_NOT_TRACK`: Set to "1" to disable telemetry.
  • `AXOLOTL_LOG_LEVEL`: Log level (default "INFO").
  • `AXOLOTL_DATASET_NUM_PROC`: Number of dataset processing workers.
  • `AXOLOTL_TEE_STDOUT`: Tee stdout to files (default "1").
  • `AXOLOTL_METRIC_NDIGITS`: Number of digits for metric rounding (default "5").
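The variables above can be read with plain `os.getenv` calls. The sketch below is an illustrative read-out of the documented defaults, not Axolotl's actual configuration code.

```python
import os


def axolotl_setting(name: str, default: str) -> str:
    """Read an AXOLOTL_* variable, falling back to its documented default."""
    return os.getenv(name, default)


# Defaults as listed above (illustrative):
log_level = axolotl_setting("AXOLOTL_LOG_LEVEL", "INFO")
tee_stdout = axolotl_setting("AXOLOTL_TEE_STDOUT", "1") == "1"
metric_ndigits = int(axolotl_setting("AXOLOTL_METRIC_NDIGITS", "5"))
```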

Quick Install

# Install core Axolotl with all dependencies
pip install axolotl

# Or install from source
git clone https://github.com/axolotl-ai-cloud/axolotl.git
cd axolotl
pip install -e .

# With optional extras
pip install -e ".[flash-attn,deepspeed]"
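After installation, a quick sanity check can confirm that the core packages resolve. This helper is illustrative, not an official Axolotl command; the package list is taken from the core dependencies above.

```python
from importlib.util import find_spec

# Core packages from the dependency list above.
CORE_PACKAGES = ["torch", "transformers", "accelerate", "peft", "datasets", "trl"]


def missing_packages(names):
    """Return the subset of `names` whose import spec cannot be found."""
    return [n for n in names if find_spec(n) is None]
```

After a successful install, `missing_packages(CORE_PACKAGES)` should return an empty list.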

Code Evidence

Python version requirement from `pyproject.toml:10`:

requires-python = ">=3.10"

Torch version validation from `setup.py:128`:

raise ValueError("axolotl requires torch>=2.4")

Platform-specific package exclusion from `setup.py:29-42`:

if "Darwin" in platform.system():
    skip_packages = [
        "bitsandbytes",
        "triton",
        "mamba-ssm",
        "xformers",
        "liger-kernel",
    ]

Tokenizer parallelism silencing from `src/axolotl/cli/__init__.py:7`:

os.environ["TOKENIZERS_PARALLELISM"] = "false"

CUDA memory allocator auto-configuration from `src/axolotl/utils/__init__.py:50-63`:

if torch_major == 2 and torch_minor >= 9 and os.getenv("PYTORCH_ALLOC_CONF") is None:
    os.environ["PYTORCH_ALLOC_CONF"] = "expandable_segments:True,roundup_power2_divisions:16"
elif torch_major == 2 and torch_minor >= 2 and os.getenv("PYTORCH_CUDA_ALLOC_CONF") is None:
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True,roundup_power2_divisions:16"

Common Errors

  • `ValueError: axolotl requires torch>=2.4`: the installed PyTorch is too old. Fix: `pip install "torch>=2.4"`.
  • `ImportError: s3:// paths require s3fs to be installed`: missing cloud filesystem package. Fix: `pip install s3fs` (or `gcsfs`, `adlfs`, `ocifs` for other cloud providers).
  • `ImportError: Please run pip uninstall fla-core...`: missing Flash Linear Attention for Kimi models. Fix: follow the error message to install fla from source.
  • `ModuleNotFoundError: No module named 'bitsandbytes'`: running on macOS, where bitsandbytes is excluded. Fix: use Linux or remove quantization options from the config.
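The optional-import errors above follow a common guard pattern: attempt the import and fail with an actionable message. The helper below is an illustrative sketch of that pattern, not Axolotl's actual implementation.

```python
from importlib import import_module


def require(module: str, feature: str):
    """Import `module` or raise an actionable ImportError, mirroring the
    '<feature> requires <module> to be installed' messages above."""
    try:
        return import_module(module)
    except ImportError as err:
        raise ImportError(
            f"{feature} requires {module} to be installed; "
            f"try `pip install {module}`"
        ) from err
```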

Compatibility Notes

  • macOS (Darwin): bitsandbytes, triton, mamba-ssm, xformers, and liger-kernel are excluded. Only CPU training and MPS device supported.
  • aarch64 (ARM64): xformers is excluded. All other packages install normally.
  • Torch version pinning: The setup.py dynamically detects installed torch version and pins xformers, vLLM, and fbgemm-gpu to compatible versions. Manual torch upgrades may break these dependencies.
  • xformers version matrix: torch 2.4 uses xformers 0.0.27-0.0.28; torch 2.5 uses 0.0.28.post2+; torch 2.6 uses 0.0.29.post3; torch 2.7 uses 0.0.30-0.0.31; torch 2.8+ uses the version from requirements.txt.
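The xformers matrix above amounts to a lookup keyed on the torch (major, minor) pair. The pins below are transcribed from that note; treat the exact range spellings as illustrative, since setup.py expresses them differently.

```python
from typing import Optional

# Transcribed from the version matrix above (illustrative range spellings).
XFORMERS_PINS = {
    (2, 4): "xformers>=0.0.27,<=0.0.28",
    (2, 5): "xformers>=0.0.28.post2",
    (2, 6): "xformers==0.0.29.post3",
    (2, 7): "xformers>=0.0.30,<=0.0.31",
}


def xformers_pin(torch_version) -> Optional[str]:
    """Pin for (major, minor); None means 'use requirements.txt' (torch 2.8+)."""
    if torch_version >= (2, 8):
        return None
    return XFORMERS_PINS.get(torch_version)
```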
