
Environment: Alibaba ROLL Python Runtime Environment

From Leeroopedia


Domains: Infrastructure, Runtime
Last Updated: 2026-02-07 19:00 GMT

Overview

A Python 3.10+ runtime environment built on the Ray 2.48.0 distributed framework, with Hydra configuration and the core ML/NLP dependencies shared by all ROLL pipelines.

Description

This environment defines the core Python runtime and common dependencies shared by all ROLL pipelines regardless of GPU platform or training/inference backend. The framework requires Python 3.10+ (as specified in `setup.py`) and relies on Ray for distributed actor management, Hydra/OmegaConf for hierarchical YAML configuration, and HuggingFace ecosystem packages (transformers, datasets, peft, trl, accelerate) for model and data handling. Additional dependencies include math verification (math-verify, latex2sympy2), RL environments (gym, gymnasium, gym_sokoban), and experiment tracking (wandb, swanlab).
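To illustrate the hierarchical Hydra/OmegaConf style mentioned above, a pipeline config might be layered roughly like this. All file names, keys, and values below are illustrative placeholders, not ROLL's actual configuration schema:

```yaml
# hypothetical config/my_pipeline.yaml -- illustrative only, not ROLL's schema
defaults:
  - base_runtime          # shared Ray / logging settings
  - backend: example      # swap in a backend-specific config group

rollout_batch_size: 64
actor_train:
  model_args:
    model_name_or_path: /path/to/model
```

Hydra composes the `defaults` list into one config tree, and OmegaConf resolves overrides passed on the command line, which is what makes a single YAML hierarchy work across pipelines and backends.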

Usage

This environment is a prerequisite for all ROLL pipelines. Every pipeline (RLVR, Agentic, DPO, SFT, Distillation, Reward FL) requires these base dependencies. GPU-specific and backend-specific packages are layered on top.

System Requirements

  • Python: >= 3.10 (3.11 required for Ascend NPU)
  • OS: Linux (macOS supported for local debugging only)
  • RAM: 16 GB+ (more for large datasets)

Dependencies

Python Packages (Core)

  • `ray[default,cgraph]` == 2.48.0
  • `numpy` >= 1.25, < 2.0
  • `tensordict` (any recent version)
  • `hydra-core` + `omegaconf`
  • `pydantic`
  • `dacite`
  • `tqdm`
  • `einops`
  • `deprecated`
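Since Ray is pinned exactly while most other core packages float, a startup sanity check can catch version drift early. A minimal sketch using only the standard library (`check_pins` and the `PINS` mapping are illustrative helpers, not part of ROLL):

```python
# Sketch: verify that pinned dependencies are installed at the expected
# versions before launching a pipeline. The pin mirrors the list above.
from importlib import metadata

PINS = {
    "ray": "2.48.0",  # exact pin required by ROLL
}

def check_pins(pins):
    """Return a list of human-readable problems; empty list means OK."""
    problems = []
    for pkg, expected in pins.items():
        try:
            installed = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            problems.append(f"{pkg}: not installed")
            continue
        if installed != expected:
            problems.append(f"{pkg}: found {installed}, expected {expected}")
    return problems
```

Running `check_pins(PINS)` at process start turns a confusing mid-run failure into an immediate, readable error.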

Python Packages (ML/NLP)

  • `transformers` (version depends on backend; 4.51.1 for SGLang, 4.57.0 for torch 2.8.0)
  • `datasets` == 3.1.0
  • `peft` == 0.12.0
  • `trl` == 0.9.6
  • `accelerate` == 0.34.2
  • `modelscope`
  • `loralib`
  • `sympy`

Python Packages (Reward/Eval)

  • `math-verify`
  • `openai`
  • `langdetect`
  • `nltk` >= 3.8
  • `latex2sympy2` == 1.5.4
  • `latex2sympy2_extended` == 1.10.1
  • `antlr4-python3-runtime` == 4.9.3
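The packages above decide whether a model's answer matches a reference expression. For plain numeric answers the idea reduces to exact rational comparison, which the standard library can sketch; ROLL's actual reward code goes through math-verify and latex2sympy2, whose APIs are not shown here:

```python
# Sketch: exact numeric answer matching. This stdlib version only covers
# plain rationals and decimals; full symbolic/LaTeX handling is what
# math-verify and latex2sympy2 provide.
from fractions import Fraction

def answers_match(candidate: str, reference: str) -> bool:
    """True if both strings parse to the same exact rational value."""
    try:
        return Fraction(candidate) == Fraction(reference)
    except (ValueError, ZeroDivisionError):
        return False
```

Exact `Fraction` comparison avoids the float-tolerance questions that make reward verification brittle, which is also why symbolic equivalence (rather than string match) is used for LaTeX answers.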

Python Packages (RL Environments)

  • `gym`
  • `gymnasium[toy-text]`
  • `gym_sokoban`
  • `gem-llm` == 0.0.4
  • `mcp`

Python Packages (Tracking)

  • `wandb`
  • `swanlab`

Environment Variables

  • `MODEL_DOWNLOAD_TYPE`: Set to `HUGGINGFACE_HUB` or `MODELSCOPE` for model downloads
  • `TOKENIZERS_PARALLELISM`: Set to `false` in agentic pipeline to avoid deadlocks
  • `RAY_DEDUP_LOGS`: Log deduplication (default `1`, enabled)
  • `roll_RPC_TIMEOUT`: Configurable RPC timeout in seconds
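The pattern behind these variables is to keep any value the user already exported and otherwise fall back to a default. A sketch of that pattern (the 600-second fallback is illustrative, not ROLL's documented default):

```python
# Sketch of the env-var defaulting pattern used by ROLL.
import os

# Mirrors roll/__init__.py: enable Ray log deduplication unless overridden.
os.environ["RAY_DEDUP_LOGS"] = os.getenv("RAY_DEDUP_LOGS", "1")

# Mirrors the agentic pipeline: disable tokenizer threads to avoid deadlocks.
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")

# roll_RPC_TIMEOUT is read as a number of seconds; the fallback value
# here is illustrative only.
rpc_timeout = int(os.getenv("roll_RPC_TIMEOUT", "600"))
```

Because `os.getenv(..., default)` and `setdefault` both respect pre-existing values, exporting any of these variables in the shell before launch takes precedence over the in-code defaults.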

Quick Install

# Install ROLL package
pip install -e .

# Install common dependencies
pip install -r requirements_common.txt

# For vision models, also install:
pip install -r requirements_vision.txt

Code Evidence

Python version requirement from `setup.py:8`:

python_requires=">=3.10",

Ray log deduplication from `roll/__init__.py:3`:

os.environ["RAY_DEDUP_LOGS"] = os.getenv("RAY_DEDUP_LOGS", "1")

Tokenizer parallelism disabled from `roll/pipeline/agentic/utils.py:43`:

os.environ["TOKENIZERS_PARALLELISM"] = "false"

TensorDict version check from `roll/distributed/scheduler/protocol.py:229`:

if tensordict.__version__ >= "0.5.0" and self.batch is not None:
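One caveat about the check above: comparing version strings with `>=` is lexicographic, so it misorders any component that reaches two digits (as a string, `0.10.0` sorts before `0.5.0` because `'1' < '5'`). A numeric parse avoids this; the helper below is a sketch, not ROLL code:

```python
# Sketch: lexicographic comparison of version strings is fragile once a
# component has two digits; parsing components numerically fixes it.
def version_tuple(v: str) -> tuple:
    """Parse a dotted numeric version like '0.5.0' into (0, 5, 0)."""
    return tuple(int(part) for part in v.split("."))

string_cmp = "0.10.0" >= "0.5.0"                               # wrong result
tuple_cmp = version_tuple("0.10.0") >= version_tuple("0.5.0")  # correct result
```

Libraries such as `packaging` offer a full `Version` type that also handles pre-release and local-version segments, which the bare tuple parse above does not.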

Common Errors

  • `ModuleNotFoundError: No module named 'roll'`: ROLL is not installed. Run `pip install -e .` from the repo root.
  • `assert self.lr_decay_steps > 0`: batch/worker configuration mismatch. Adjust `rollout_batch_size`, `gradient_accumulation_steps`, or the DP size.
  • Ray timeout errors: the RPC timeout is too short for the workload. Set the `roll_RPC_TIMEOUT` environment variable to a higher value.

Compatibility Notes

  • Python 3.10: Minimum required. TaskGroup features require 3.11+.
  • numpy: Must be < 2.0 for compatibility.
  • antlr4: Pinned to 4.9.3 for latex2sympy2 compatibility.
  • Ray: Pinned to 2.48.0 with cgraph support required.
