Environment:Bigscience workshop Petals Python Hivemind
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Distributed_Computing |
| Last Updated | 2026-02-09 13:00 GMT |
Overview
Python >= 3.8 client environment with hivemind, PyTorch, transformers, and petals for distributed inference and training.
Description
This environment provides the core Python runtime required for all client-side interactions with the Petals distributed network. It combines several key components: hivemind supplies the peer-to-peer networking layer that connects clients to remote servers hosting model shards; PyTorch handles local tensor computation, autograd, and GPU acceleration when available; transformers manages model configuration loading, tokenizer instantiation, and generation utilities; and petals itself orchestrates distributed inference by routing forward and backward passes through sequences of remote transformer blocks. Together these packages form the minimum viable environment needed to connect to a Petals swarm, load a distributed large language model, run inference sessions, perform autoregressive text generation, and execute prompt tuning or other fine-tuning workflows across the decentralized network.
Usage
This environment is required for all client-side operations within the Petals ecosystem. This includes loading distributed model classes such as AutoDistributedModelForCausalLM and architecture-specific variants like DistributedLlamaForCausalLM or DistributedBloomForCausalLM; tokenizing inputs using HuggingFace tokenizers; opening inference sessions via InferenceSession for stepped generation; running full autoregressive generation through RemoteGenerationMixin.generate(); performing prompt tuning with PTuneMixin; executing distributed autograd through RemoteSequentialAutogradFunction for training and fine-tuning; and running distributed evaluation loops. Any script or application that communicates with the Petals swarm must have this environment installed.
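The client workflow described above can be sketched end to end. This is a minimal, illustrative sketch: it assumes petals and transformers are installed and that a public swarm hosts the named model (the model name here is only an example). Imports are deferred so the helper can be defined without the full environment.

```python
def generate_with_petals(model_name: str = "bigscience/bloom-560m",
                         prompt: str = "A cat sat on") -> str:
    # Deferred imports: defining this function does not require petals itself.
    from transformers import AutoTokenizer
    from petals import AutoDistributedModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # Connects to the swarm and routes forward passes through remote blocks.
    model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    # RemoteGenerationMixin.generate() drives autoregressive decoding remotely.
    output_ids = model.generate(input_ids, max_new_tokens=5)
    return tokenizer.decode(output_ids[0])
```

Calling this function opens a connection to the swarm, so it only succeeds with network access and a live set of servers hosting the model.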
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, Windows (WSL2) | macOS requires special fork safety env vars |
| Hardware | Any CPU (GPU optional for client) | Client runs on CPU; servers need GPU |
| Network | Internet access | Required for P2P swarm connectivity |
| Python | >= 3.8 | Supports 3.8, 3.9, 3.10, 3.11 |
Dependencies
System Packages
- No special system packages required for client-only usage
Python Packages
- petals (installs all below)
- torch >= 1.12
- hivemind @ git+https://github.com/learning-at-home/hivemind.git@213bff98a62accb91f254e2afdccbf1d69ebdea9
- transformers == 4.43.1
- huggingface-hub >= 0.11.1, < 1.0.0
- tokenizers >= 0.13.3
- accelerate >= 0.27.2
- bitsandbytes == 0.41.1
- tensor_parallel == 1.0.23
- peft == 0.8.2
- safetensors >= 0.3.1
- sentencepiece >= 0.1.99
- packaging >= 20.9
- async-timeout >= 4.0.2
- humanfriendly
- Dijkstar >= 2.6.0
- numpy < 2
Credentials
HF_TOKEN (optional): HuggingFace API token for accessing gated models (e.g., Llama-2). Required when loading meta-llama/Llama-2-* models.
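A sketch of loading a gated checkpoint with the token read from HF_TOKEN. This assumes petals is installed; load_gated_model is an illustrative helper name, and the import is deferred so the definition works without the full environment.

```python
import os

def load_gated_model(model_name: str = "meta-llama/Llama-2-70b-hf"):
    # Deferred import: petals is only needed when the helper is actually called.
    from petals import AutoDistributedModelForCausalLM

    token = os.environ.get("HF_TOKEN")  # or pass token=True to reuse a cached login
    return AutoDistributedModelForCausalLM.from_pretrained(model_name, token=token)
```

Without a valid token, calling this for a meta-llama/Llama-2-* repository fails the authentication check shown in the Code Evidence section below.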
Quick Install
pip install petals
Code Evidence
Transformers version pinning from src/petals/__init__.py:23-26:
if not os.getenv("PETALS_IGNORE_DEPENDENCY_VERSION"):
    assert (
        version.parse("4.43.1") <= version.parse(transformers.__version__) < version.parse("4.44.0")
    ), "Please install a proper transformers version: pip install transformers>=4.43.1,<4.44.0"
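The effect of this gate can be imitated with a small stdlib-only sketch. The parse_version helper below is an illustrative stand-in for packaging.version.parse and only handles plain numeric releases:

```python
import os

def parse_version(v: str) -> tuple:
    # Illustrative stand-in for packaging.version.parse: keeps only the
    # leading numeric segments, so "4.43.1" becomes (4, 43, 1).
    parts = []
    for piece in v.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def transformers_version_ok(installed: str) -> bool:
    # Mirrors the assertion above: accept only 4.43.1 <= version < 4.44.0,
    # unless the escape-hatch env var is set.
    if os.getenv("PETALS_IGNORE_DEPENDENCY_VERSION"):
        return True
    return (4, 43, 1) <= parse_version(installed) < (4, 44, 0)
```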
macOS fork safety from src/petals/__init__.py:6-9:
if platform.system() == "Darwin":
    os.environ.setdefault("no_proxy", "*")
    os.environ.setdefault("OBJC_DISABLE_INITIALIZE_FORK_SAFETY", "YES")
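Because these defaults are applied with os.environ.setdefault, any value you export yourself is never overwritten. A stdlib-only demonstration of that semantics (the DEMO_FORK_SAFETY name is made up for illustration):

```python
import os

# setdefault only writes a variable that is not already set, so user-supplied
# values survive import-time defaults like the ones Petals applies on macOS.
os.environ.pop("DEMO_FORK_SAFETY", None)
os.environ.setdefault("DEMO_FORK_SAFETY", "YES")  # unset -> default applied
os.environ["DEMO_FORK_SAFETY"] = "NO"             # explicit user override
os.environ.setdefault("DEMO_FORK_SAFETY", "YES")  # already set -> untouched
```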
HuggingFace auth check from src/petals/utils/hf_auth.py:5-7:
def always_needs_auth(model_name: Union[str, os.PathLike, None]) -> bool:
    loading_from_repo = model_name is not None and not os.path.isdir(model_name)
    return loading_from_repo and model_name.startswith("meta-llama/Llama-2-")
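This check is small enough to reproduce as a self-contained, runnable copy for experimentation (behavior mirrors the excerpt above: only remote meta-llama/Llama-2-* repos always require a token; local directories and other repos do not):

```python
import os
from typing import Union

def always_needs_auth(model_name: Union[str, os.PathLike, None]) -> bool:
    # A path that exists on disk is a local checkout, not a Hub repo,
    # so it never triggers the mandatory-auth rule.
    loading_from_repo = model_name is not None and not os.path.isdir(model_name)
    return loading_from_repo and str(model_name).startswith("meta-llama/Llama-2-")
```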
Python version and dependencies from setup.cfg:33-52:
python_requires = >=3.8
install_requires =
    torch>=1.12
    bitsandbytes==0.41.1
    accelerate>=0.27.2
    huggingface-hub>=0.11.1,<1.0.0
    tokenizers>=0.13.3
    transformers==4.43.1
    hivemind @ git+https://github.com/learning-at-home/hivemind.git@...
    tensor_parallel==1.0.23
    peft==0.8.2
    safetensors>=0.3.1
    sentencepiece>=0.1.99
    packaging>=20.9
    async-timeout>=4.0.2
    numpy<2
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| AssertionError: Please install a proper transformers version | transformers version outside the 4.43.x range | pip install transformers==4.43.1 |
| OSError: meta-llama/Llama-2-* requires authentication | Gated model needs an HF token | Set the HF_TOKEN env var or pass token=True |
| ImportError: hivemind not found | hivemind not installed from the pinned commit | pip install petals (installs the pinned hivemind) |
Compatibility Notes
- macOS: Automatically sets OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES and no_proxy=* to prevent fork-related crashes.
- PETALS_IGNORE_DEPENDENCY_VERSION: Set this env var to bypass the strict transformers version check.
- USE_LEGACY_BFLOAT16: Controls bfloat16 serialization mode in hivemind; Petals defaults to modern mode.
- bitsandbytes: The BITSANDBYTES_NOWELCOME env var is set automatically to suppress the welcome message.
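As an example, the strict transformers pin can be bypassed by exporting the escape hatch before petals is imported (use sparingly; untested transformers versions may break the client):

```python
import os

# Must be set before `import petals`, since the version check runs at import time.
os.environ["PETALS_IGNORE_DEPENDENCY_VERSION"] = "1"
# import petals  # would now skip the transformers version assertion
```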
Related Pages
- Implementation:Bigscience_workshop_Petals_AutoDistributedModelForCausalLM_From_Pretrained
- Implementation:Bigscience_workshop_Petals_InferenceSession
- Implementation:Bigscience_workshop_Petals_RemoteGenerationMixin_Generate
- Implementation:Bigscience_workshop_Petals_PTuneMixin
- Implementation:Bigscience_workshop_Petals_RemoteSequentialAutogradFunction
- Implementation:Bigscience_workshop_Petals_DistributedLlamaForSequenceClassification_From_Pretrained
- Implementation:Bigscience_workshop_Petals_DistributedBloomForCausalLM_From_Pretrained
- Implementation:Bigscience_workshop_Petals_Distributed_Evaluation_Loop
- Implementation:Bigscience_workshop_Petals_RemoteGenerationMixin_Generate_With_Session