Environment:Alibaba ROLL Python Runtime Environment
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Runtime |
| Last Updated | 2026-02-07 19:00 GMT |
Overview
Python 3.10+ runtime environment with Ray 2.48.0 distributed framework, Hydra configuration, and core ML/NLP dependencies for all ROLL pipelines.
Description
This environment defines the core Python runtime and the common dependencies shared by all ROLL pipelines, regardless of GPU platform or training/inference backend. The framework requires Python 3.10+ (as specified in `setup.py`) and relies on Ray for distributed actor management, Hydra/OmegaConf for hierarchical YAML configuration, and Hugging Face ecosystem packages (transformers, datasets, peft, trl, accelerate) for model and data handling. Additional dependencies cover math verification (math-verify, latex2sympy2), RL environments (gym, gymnasium, gym_sokoban), and experiment tracking (wandb, swanlab).
Usage
This environment is a prerequisite for all ROLL pipelines. Every pipeline (RLVR, Agentic, DPO, SFT, Distillation, Reward FL) requires these base dependencies. GPU-specific and backend-specific packages are layered on top.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| Python | >= 3.10 | 3.11 required for Ascend NPU |
| OS | Linux | macOS for local debugging only |
| RAM | 16 GB+ | More for large datasets |
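The table above can be turned into a quick preflight check. The following is a minimal sketch, not part of ROLL: `runtime_problems` is a hypothetical helper name, and treating any OS other than Linux or macOS as "not listed" is an assumption drawn from the table rather than from the ROLL source.

```python
import platform
import sys


def runtime_problems(version_info=None, system=None):
    """Return human-readable problems; an empty list means the host qualifies."""
    version_info = sys.version_info if version_info is None else version_info
    system = platform.system() if system is None else system

    problems = []
    # setup.py declares python_requires=">=3.10".
    if tuple(version_info[:2]) < (3, 10):
        problems.append(
            f"Python >= 3.10 required, found {version_info[0]}.{version_info[1]}"
        )
    # platform.system() reports macOS as "Darwin".
    if system == "Darwin":
        problems.append("macOS is suitable for local debugging only")
    elif system != "Linux":
        problems.append(f"not listed as supported: {system}")
    return problems


if __name__ == "__main__":
    for problem in runtime_problems():
        print("WARNING:", problem)
```

Passing `version_info` and `system` explicitly keeps the function testable on any machine; called with no arguments it inspects the current interpreter and host.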
Dependencies
Python Packages (Core)
- `ray[default,cgraph]` == 2.48.0
- `numpy` >= 1.25, < 2.0
- `tensordict` (unpinned; any recent version, though the batch protocol branches on version 0.5.0)
- `hydra-core` + `omegaconf`
- `pydantic`
- `dacite`
- `tqdm`
- `einops`
- `deprecated`
Python Packages (ML/NLP)
- `transformers` (version depends on backend; 4.51.1 for SGLang, 4.57.0 for torch 2.8.0)
- `datasets` == 3.1.0
- `peft` == 0.12.0
- `trl` == 0.9.6
- `accelerate` == 0.34.2
- `modelscope`
- `loralib`
- `sympy`
Python Packages (Reward/Eval)
- `math-verify`
- `openai`
- `langdetect`
- `nltk` >= 3.8
- `latex2sympy2` == 1.5.4
- `latex2sympy2_extended` == 1.10.1
- `antlr4-python3-runtime` == 4.9.3
Python Packages (RL Environments)
- `gym`
- `gymnasium[toy-text]`
- `gym_sokoban`
- `gem-llm` == 0.0.4
- `mcp`
Python Packages (Tracking)
- `wandb`
- `swanlab`
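Because several packages above are pinned to exact versions, it can help to compare an existing environment against those pins before launching a pipeline. The sketch below uses the standard-library `importlib.metadata`; the `PINNED` dictionary copies only the exact (`==`) pins from the lists above, and `installed_version` and `pin_mismatches` are hypothetical helper names, not ROLL APIs.

```python
from importlib import metadata

# Exact pins copied from the dependency lists above (range pins are omitted).
PINNED = {
    "ray": "2.48.0",
    "datasets": "3.1.0",
    "peft": "0.12.0",
    "trl": "0.9.6",
    "accelerate": "0.34.2",
    "latex2sympy2": "1.5.4",
    "latex2sympy2_extended": "1.10.1",
    "antlr4-python3-runtime": "4.9.3",
    "gem-llm": "0.0.4",
}


def installed_version(name):
    """Return the installed distribution version, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def pin_mismatches(pins, lookup=installed_version):
    """Map each package that deviates from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in pin_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

The comparison is a plain string match, so a build with a local suffix (for example `+cpu`) is reported as a mismatch; that is deliberate for a strict check and easy to relax.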
Credentials
- `MODEL_DOWNLOAD_TYPE`: Selects the model download source; set to `HUGGINGFACE_HUB` or `MODELSCOPE`.
- `TOKENIZERS_PARALLELISM`: Set to `false` by the agentic pipeline (`roll/pipeline/agentic/utils.py`) to avoid deadlocks.
- `RAY_DEDUP_LOGS`: Controls Ray log deduplication; ROLL defaults it to `1` (enabled) when it is not already set.
- `roll_RPC_TIMEOUT`: RPC timeout in seconds; raise it for long-running workloads.
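These variables are typically set in the launch environment before ROLL starts. A minimal sketch of doing so from Python is shown here; `configure_roll_env` is a hypothetical helper, and writing the timeout as a whole number of seconds is an assumption about how ROLL parses `roll_RPC_TIMEOUT`.

```python
import os

ALLOWED_DOWNLOAD_TYPES = ("HUGGINGFACE_HUB", "MODELSCOPE")


def configure_roll_env(env, download_type="HUGGINGFACE_HUB", rpc_timeout_s=None):
    """Populate a mapping (for example os.environ) with ROLL runtime variables."""
    if download_type not in ALLOWED_DOWNLOAD_TYPES:
        raise ValueError(f"MODEL_DOWNLOAD_TYPE must be one of {ALLOWED_DOWNLOAD_TYPES}")
    env["MODEL_DOWNLOAD_TYPE"] = download_type

    # Keep any value the user already exported; otherwise enable deduplication.
    env.setdefault("RAY_DEDUP_LOGS", "1")

    if rpc_timeout_s is not None:
        if rpc_timeout_s <= 0:
            raise ValueError("rpc_timeout_s must be positive")
        # Assumption: the timeout is read as an integer number of seconds.
        env["roll_RPC_TIMEOUT"] = str(int(rpc_timeout_s))
    return env


if __name__ == "__main__":
    configure_roll_env(os.environ, "MODELSCOPE", rpc_timeout_s=3600)
    print({k: os.environ[k] for k in ("MODEL_DOWNLOAD_TYPE", "roll_RPC_TIMEOUT")})
```

Calling the helper before `import roll` is the conservative choice, since this page does not state when each variable is read.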
Quick Install
# Install ROLL package
pip install -e .
# Install common dependencies
pip install -r requirements_common.txt
# For vision models, also install:
pip install -r requirements_vision.txt
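After the install steps above, a short smoke test can confirm that the key modules are importable without actually importing them. This is a sketch under assumptions: `missing_modules` is a hypothetical helper, and the module list is an illustrative subset (note that `hydra-core` is imported as `hydra`).

```python
from importlib import util

# Top-level import names for a subset of the core dependencies.
CORE_MODULES = ["roll", "ray", "hydra", "omegaconf", "tensordict", "transformers"]


def missing_modules(names):
    """Return the top-level module names that cannot be located on sys.path."""
    return [name for name in names if util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(CORE_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All core modules found.")
```

If `roll` appears in the output, the editable install was most likely run outside the repository root.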
Code Evidence
Python version requirement from `setup.py:8`:
python_requires=">=3.10",
Ray log deduplication from `roll/__init__.py:3`:
os.environ["RAY_DEDUP_LOGS"] = os.getenv("RAY_DEDUP_LOGS", "1")
Tokenizer parallelism disabled from `roll/pipeline/agentic/utils.py:43`:
os.environ["TOKENIZERS_PARALLELISM"] = "false"
TensorDict version check from `roll/distributed/scheduler/protocol.py:229`:
if tensordict.__version__ >= "0.5.0" and self.batch is not None:
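One caveat on the quoted check: it compares version strings lexicographically, so a release such as `0.10.0` would sort below `0.5.0`. Whether this matters depends on the tensordict versions actually deployed. A minimal stdlib-only sketch of a numeric comparison follows; `release_tuple` and `at_least` are hypothetical helper names and are not part of ROLL.

```python
import re


def release_tuple(version):
    """Parse the leading numeric release segment, e.g. '0.5.0+cu121' -> (0, 5, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def at_least(installed, minimum):
    """Return True if installed >= minimum, comparing release numbers numerically.

    Pre-release and local suffixes (rc1, +cu121) are ignored in this sketch.
    """
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.5' and '0.5.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

Where the `packaging` library is available, `packaging.version.Version` handles the same comparison and also orders pre-releases correctly.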
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `ModuleNotFoundError: No module named 'roll'` | ROLL not installed | Run `pip install -e .` from repo root |
| `assert self.lr_decay_steps > 0` | Batch/worker configuration mismatch | Adjust `rollout_batch_size`, `gradient_accumulation_steps`, or DP size |
| Ray timeout errors | RPC timeout too short for the workload | Set the `roll_RPC_TIMEOUT` environment variable to a higher value |
Compatibility Notes
- Python 3.10: Minimum supported version. Code paths that use `asyncio.TaskGroup` require 3.11+.
- numpy: Must stay below 2.0 (the supported range is >= 1.25, < 2.0).
- antlr4: `antlr4-python3-runtime` is pinned to 4.9.3 for latex2sympy2 compatibility.
- Ray: Pinned to 2.48.0; the `cgraph` extra is required.
Related Pages
- Implementation:Alibaba_ROLL_RLVRConfig
- Implementation:Alibaba_ROLL_AgenticConfig
- Implementation:Alibaba_ROLL_DPOConfig
- Implementation:Alibaba_ROLL_SFTConfig
- Implementation:Alibaba_ROLL_DistillConfig
- Implementation:Alibaba_ROLL_RewardFLConfig
- Implementation:Alibaba_ROLL_Cluster
- Implementation:Alibaba_ROLL_RolloutScheduler