Environment:Iamhankai_Forest_of_Thought_Python_CUDA_Runtime
| Knowledge Sources | Details |
|---|---|
| Domains | Infrastructure, Deep_Learning, LLMs |
| Last Updated | 2026-02-14 03:30 GMT |
Overview
Linux environment with Python 3.10, CUDA >= 11.7, PyTorch 2.3.0, and HuggingFace Transformers 4.41.2 for running Forest-of-Thought LLM reasoning experiments.
Description
This environment provides the full GPU-accelerated runtime required by the Forest-of-Thought framework. The system loads large language models (LLaMA, Qwen, GLM, DeepSeek) using HuggingFace Transformers with automatic device mapping (`device_map='auto'`) and mixed-precision inference (`torch.float16` for pipeline mode, `torch.bfloat16` for direct model loading). All inference is hardcoded to run on CUDA (`self.device = "cuda"`), making an NVIDIA GPU mandatory.
The framework also relies on SymPy for symbolic math verification, Pandas and the HuggingFace Datasets library for benchmark data loading, and NumPy for numerical operations within the MCTS and BFS search algorithms.
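To illustrate the SymPy role described above, here is a minimal sketch of symbolic answer checking. The helper name `answers_match` and the comparison logic are assumptions for illustration, not the framework's actual verification code:

```python
from sympy import simplify, sympify

def answers_match(candidate: str, reference: str) -> bool:
    """Check two math expressions for symbolic equivalence.

    Hypothetical helper illustrating SymPy-based verification;
    the framework's actual checking logic may differ.
    """
    try:
        # simplify(a - b) reduces to 0 iff the expressions are equivalent
        return simplify(sympify(candidate) - sympify(reference)) == 0
    except Exception:
        # Unparseable model output counts as a failed check
        return False
```

With this sketch, `answers_match("2*(x + 3)", "2*x + 6")` returns `True` even though the strings differ, which is why symbolic comparison is preferred over string matching for math benchmarks.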
Usage
Use this environment for all Forest-of-Thought workflows: FoT Benchmark Evaluation (MCTS/CoT/ToT on GSM8K, MATH500, AIME), Game24 Forest Solving (BFS-based), and CGDM Post-Processing (LLM-as-judge). Every implementation in this repository requires this runtime since model loading is a universal prerequisite.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux (Ubuntu recommended) | Conda environment setup documented in README |
| Hardware | NVIDIA GPU with CUDA support | Device is hardcoded to `"cuda"` in `models/load_local_model.py:L15` |
| VRAM | Minimum 16GB (40GB+ recommended) | 7B models in bfloat16 require ~14GB; larger models (QwQ-32B) require 40GB+ |
| Python | 3.10 | Specified in README conda create command |
| CUDA | >= 11.7 | Specified in README requirements section |
| Disk | 50GB+ SSD | Model weights (8B model ~16GB), datasets, and output logs |
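The VRAM figures in the table follow from simple bytes-per-parameter arithmetic. A back-of-the-envelope estimate, covering weights only (activations, KV cache, and CUDA context add overhead on top):

```python
def weight_vram_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Estimate GPU memory needed for model weights alone.

    bfloat16/float16 use 2 bytes per parameter; runtime overhead
    (activations, KV cache, CUDA context) comes on top of this.
    """
    return params_billion * 1e9 * bytes_per_param / 1024**3

# A 7B model in bfloat16 needs roughly 13 GiB for weights alone,
# consistent with the ~14 GB figure in the table above.
print(round(weight_vram_gb(7), 1))
```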
Dependencies
System Packages
- CUDA Toolkit >= 11.7
- `conda` (for virtual environment management)
- `git` (for cloning repository)
Python Packages
- `torch` == 2.3.0
- `transformers` == 4.41.2
- `datasets` == 3.1.0
- `sympy` == 1.12
- `numpy` == 1.24.3
- `pandas` == 2.0.3
- `tqdm` == 4.65.0
- `openai` == 0.27.7
- `aiohttp` == 3.8.4
- `backoff` == 2.2.1
- `requests` == 2.31.0
- `mpmath` == 1.3.0
Credentials
No mandatory credentials for the local model inference path. For the optional OpenAI GPT-based path (models/models.py), see the Environment:Iamhankai_Forest_of_Thought_OpenAI_API_Credentials environment page.
Quick Install
```shell
# Create conda environment
conda create -n fot python=3.10 -y
conda activate fot

# Install all required packages
pip install -r requirements.txt

# Or install individually:
pip install torch==2.3.0 transformers==4.41.2 datasets==3.1.0 sympy==1.12 numpy==1.24.3 pandas==2.0.3 tqdm==4.65.0 openai==0.27.7 aiohttp==3.8.4 backoff==2.2.1 requests==2.31.0 mpmath==1.3.0
```
Code Evidence
CUDA device hardcoded in `models/load_local_model.py:L15`:
```python
self.device = "cuda"
```
Model loading with bfloat16 and auto device mapping in `models/load_local_model.py:L40-46`:
```python
self.model = AutoModelForCausalLM.from_pretrained(
    self.model_id,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    device_map='auto',
).eval()
```
Pipeline mode with float16 in `models/load_local_model.py:L31-37`:
```python
self.pipeline = transformers.pipeline(
    "text-generation",
    model=self.model_id,
    model_kwargs={"torch_dtype": torch.float16},
    device_map='auto',
    trust_remote_code=True,
)
```
Inference pinned to the first GPU in `scripts/game24/run.py:L9` (this selects device 0 rather than enforcing that a GPU exists):

```python
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
```
Python and CUDA requirements from `README.md:L13-17`:
```
Python == 3.10
CUDA Version >= 11.7
pip install -r requirements.txt
```
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `ValueError: Input length of input_ids is X, but max_length is set to Y` | Input prompt exceeds the configured `max_length` | The code auto-recovers by parsing the error and extending `max_length` by 100 (see `load_local_model.py:L73-75`) |
| `RuntimeError: CUDA out of memory` | Insufficient GPU VRAM for the model | Use a smaller model, reduce `max_new_tokens`, or use a GPU with more VRAM |
| `torch.cuda.is_available()` returns `False` | No CUDA-capable GPU found | Install CUDA toolkit >= 11.7 and verify GPU drivers with `nvidia-smi` |
| `ImportError: No module named 'transformers'` | Missing Python dependency | Run `pip install -r requirements.txt` |
Compatibility Notes
- GPU Required: CPU-only execution is not supported. The device is hardcoded to `"cuda"` without a CPU fallback path.
- Multi-GPU: Supported via `device_map='auto'` which uses HuggingFace Accelerate for automatic model sharding across available GPUs.
- Model Architectures: Supports LLaMA, Qwen, GLM, DeepSeek, and Mistral model families via architecture-specific inference paths in `Pipeline`.
- trust_remote_code: All model loading uses `trust_remote_code=True`, meaning custom model code from HuggingFace Hub will be executed. Only use trusted model repositories.
Related Pages
- Implementation:Iamhankai_Forest_of_Thought_Pipeline_Init
- Implementation:Iamhankai_Forest_of_Thought_Monte_Carlo_Forest
- Implementation:Iamhankai_Forest_of_Thought_Monte_Carlo_Tree
- Implementation:Iamhankai_Forest_of_Thought_ToT_Task_Run
- Implementation:Iamhankai_Forest_of_Thought_CoT_Task_Run
- Implementation:Iamhankai_Forest_of_Thought_Forest_Solve
- Implementation:Iamhankai_Forest_of_Thought_CGDM_Get_Best_Answer