
Environment: Alibaba ROLL Python Runtime Environment

From Leeroopedia


Domains: Infrastructure, Runtime
Last Updated: 2026-02-07 19:00 GMT

Overview

A Python 3.10+ runtime environment built on the Ray 2.48.0 distributed framework, with Hydra configuration and the core ML/NLP dependencies shared by all ROLL pipelines.

Description

This environment defines the core Python runtime and common dependencies shared by all ROLL pipelines regardless of GPU platform or training/inference backend. The framework requires Python 3.10+ (as specified in `setup.py`) and relies on Ray for distributed actor management, Hydra/OmegaConf for hierarchical YAML configuration, and HuggingFace ecosystem packages (transformers, datasets, peft, trl, accelerate) for model and data handling. Additional dependencies include math verification (math-verify, latex2sympy2), RL environments (gym, gymnasium, gym_sokoban), and experiment tracking (wandb, swanlab).
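To illustrate the hierarchical Hydra/OmegaConf style mentioned above, a pipeline config might be layered roughly like this. All file names, keys, and values below are illustrative placeholders, not ROLL's actual configuration schema:

```yaml
# hypothetical config/my_pipeline.yaml -- illustrative only, not ROLL's schema
defaults:
  - base_runtime          # shared Ray / logging settings
  - backend: example      # swap in a backend-specific config group

rollout_batch_size: 64
actor_train:
  model_args:
    model_name_or_path: /path/to/model
```

Hydra composes the `defaults` list into one config tree, and OmegaConf resolves overrides passed on the command line, which is what makes a single YAML hierarchy work across pipelines and backends.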

Usage

This environment is a prerequisite for all ROLL pipelines. Every pipeline (RLVR, Agentic, DPO, SFT, Distillation, Reward FL) requires these base dependencies. GPU-specific and backend-specific packages are layered on top.

System Requirements

  • Python: >= 3.10 (3.11 required for Ascend NPU)
  • OS: Linux (macOS supported for local debugging only)
  • RAM: 16 GB+ (more for large datasets)

Dependencies

Python Packages (Core)

  • `ray[default,cgraph]` == 2.48.0
  • `numpy` >= 1.25, < 2.0
  • `tensordict` (any recent version)
  • `hydra-core` + `omegaconf`
  • `pydantic`
  • `dacite`
  • `tqdm`
  • `einops`
  • `deprecated`
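Since Ray is pinned exactly while most other core packages float, a startup sanity check can catch version drift early. A minimal sketch using only the standard library (`check_pins` and the `PINS` mapping are illustrative helpers, not part of ROLL):

```python
# Sketch: verify that pinned dependencies are installed at the expected
# versions before launching a pipeline. The pin mirrors the list above.
from importlib import metadata

PINS = {
    "ray": "2.48.0",  # exact pin required by ROLL
}

def check_pins(pins):
    """Return a list of human-readable problems; empty list means OK."""
    problems = []
    for pkg, expected in pins.items():
        try:
            installed = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            problems.append(f"{pkg}: not installed")
            continue
        if installed != expected:
            problems.append(f"{pkg}: found {installed}, expected {expected}")
    return problems
```

Running `check_pins(PINS)` at process start turns a confusing mid-run failure into an immediate, readable error.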

Python Packages (ML/NLP)

  • `transformers` (version depends on backend; 4.51.1 for SGLang, 4.57.0 for torch 2.8.0)
  • `datasets` == 3.1.0
  • `peft` == 0.12.0
  • `trl` == 0.9.6
  • `accelerate` == 0.34.2
  • `modelscope`
  • `loralib`
  • `sympy`

Python Packages (Reward/Eval)

  • `math-verify`
  • `openai`
  • `langdetect`
  • `nltk` >= 3.8
  • `latex2sympy2` == 1.5.4
  • `latex2sympy2_extended` == 1.10.1
  • `antlr4-python3-runtime` == 4.9.3
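The packages above decide whether a model's answer matches a reference expression. For plain numeric answers the idea reduces to exact rational comparison, which the standard library can sketch; ROLL's actual reward code goes through math-verify and latex2sympy2, whose APIs are not shown here:

```python
# Sketch: exact numeric answer matching. This stdlib version only covers
# plain rationals and decimals; full symbolic/LaTeX handling is what
# math-verify and latex2sympy2 provide.
from fractions import Fraction

def answers_match(candidate: str, reference: str) -> bool:
    """True if both strings parse to the same exact rational value."""
    try:
        return Fraction(candidate) == Fraction(reference)
    except (ValueError, ZeroDivisionError):
        return False
```

Exact `Fraction` comparison avoids the float-tolerance questions that make reward verification brittle, which is also why symbolic equivalence (rather than string match) is used for LaTeX answers.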

Python Packages (RL Environments)

  • `gym`
  • `gymnasium[toy-text]`
  • `gym_sokoban`
  • `gem-llm` == 0.0.4
  • `mcp`

Python Packages (Tracking)

  • `wandb`
  • `swanlab`

Environment Variables

  • `MODEL_DOWNLOAD_TYPE`: Set to `HUGGINGFACE_HUB` or `MODELSCOPE` for model downloads
  • `TOKENIZERS_PARALLELISM`: Set to `false` in agentic pipeline to avoid deadlocks
  • `RAY_DEDUP_LOGS`: Log deduplication (default `1`, enabled)
  • `roll_RPC_TIMEOUT`: Configurable RPC timeout in seconds
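The pattern behind these variables is to keep any value the user already exported and otherwise fall back to a default. A sketch of that pattern (the 600-second fallback is illustrative, not ROLL's documented default):

```python
# Sketch of the env-var defaulting pattern used by ROLL.
import os

# Mirrors roll/__init__.py: enable Ray log deduplication unless overridden.
os.environ["RAY_DEDUP_LOGS"] = os.getenv("RAY_DEDUP_LOGS", "1")

# Mirrors the agentic pipeline: disable tokenizer threads to avoid deadlocks.
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")

# roll_RPC_TIMEOUT is read as a number of seconds; the fallback value
# here is illustrative only.
rpc_timeout = int(os.getenv("roll_RPC_TIMEOUT", "600"))
```

Because `os.getenv(..., default)` and `setdefault` both respect pre-existing values, exporting any of these variables in the shell before launch takes precedence over the in-code defaults.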

Quick Install

# Install ROLL package
pip install -e .

# Install common dependencies
pip install -r requirements_common.txt

# For vision models, also install:
pip install -r requirements_vision.txt

Code Evidence

Python version requirement from `setup.py:8`:

python_requires=">=3.10",

Ray log deduplication from `roll/__init__.py:3`:

os.environ["RAY_DEDUP_LOGS"] = os.getenv("RAY_DEDUP_LOGS", "1")

Tokenizer parallelism disabled from `roll/pipeline/agentic/utils.py:43`:

os.environ["TOKENIZERS_PARALLELISM"] = "false"

TensorDict version check from `roll/distributed/scheduler/protocol.py:229`:

if tensordict.__version__ >= "0.5.0" and self.batch is not None:
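One caveat about the check above: comparing version strings with `>=` is lexicographic, so it misorders any component that reaches two digits (as a string, `0.10.0` sorts before `0.5.0` because `'1' < '5'`). A numeric parse avoids this; the helper below is a sketch, not ROLL code:

```python
# Sketch: lexicographic comparison of version strings is fragile once a
# component has two digits; parsing components numerically fixes it.
def version_tuple(v: str) -> tuple:
    """Parse a dotted numeric version like '0.5.0' into (0, 5, 0)."""
    return tuple(int(part) for part in v.split("."))

string_cmp = "0.10.0" >= "0.5.0"                               # wrong result
tuple_cmp = version_tuple("0.10.0") >= version_tuple("0.5.0")  # correct result
```

Libraries such as `packaging` offer a full `Version` type that also handles pre-release and local-version segments, which the bare tuple parse above does not.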

Common Errors

  • `ModuleNotFoundError: No module named 'roll'`: ROLL is not installed. Run `pip install -e .` from the repo root.
  • `assert self.lr_decay_steps > 0`: batch/worker configuration mismatch. Adjust `rollout_batch_size`, `gradient_accumulation_steps`, or the DP size.
  • Ray timeout errors: the RPC timeout is too short for the workload. Set the `roll_RPC_TIMEOUT` environment variable to a higher value.

Compatibility Notes

  • Python 3.10: Minimum required. TaskGroup features require 3.11+.
  • numpy: Must be < 2.0 for compatibility.
  • antlr4: Pinned to 4.9.3 for latex2sympy2 compatibility.
  • Ray: Pinned to 2.48.0 with cgraph support required.
