# Environment: Eric Mitchell's Direct Preference Optimization (DPO) Python Dependencies
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure |
| Last Updated | 2026-02-08 02:00 GMT |
## Overview
Python 3.8+ environment with Hydra 1.3.2, HuggingFace Datasets 2.12.0, W&B 0.15.3, and supporting libraries for configuration, data loading, and experiment tracking.
## Description
This environment provides the Python runtime and non-PyTorch dependencies required to run the DPO training pipeline. It includes Hydra for composable YAML-based configuration with CLI overrides, HuggingFace Datasets for downloading and caching preference datasets (Anthropic-HH, Stanford Human Preferences, StackExchange), Weights & Biases for experiment logging, and BeautifulSoup4 for HTML parsing of StackExchange data. These packages form the data and configuration layer of the training pipeline.
## Usage
Use this environment for configuration management (Hydra-based config resolution and CLI overrides) and data pipeline operations (dataset downloading, caching, HTML stripping, batch iteration). Required as a prerequisite for any training or evaluation workflow.
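Both concerns meet on the command line: Hydra resolves the config tree and applies `key=value` overrides. An illustrative invocation might look like the following (only `exp_name` and `wandb.enabled` are confirmed keys elsewhere on this page; `datasets=[hh]` is an assumption about the repo's config layout):

```bash
# Name the run and disable W&B logging; the datasets override is illustrative
python -u train.py exp_name=my_experiment wandb.enabled=false datasets=[hh]
```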
## System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, or Windows | Cross-platform Python packages |
| Python | 3.8+ | Recommended in README |
| Network | Internet access (first run) | Required to download datasets from HuggingFace Hub; cached locally after first download |
| Disk | 10GB+ (≈13GB if all three datasets are cached) | Per-dataset caches: Anthropic-HH ~300MB, SHP ~2GB, SE ~10GB+ |
## Dependencies
### Python Packages
- `hydra-core` == 1.3.2
- `datasets` == 2.12.0
- `wandb` == 0.15.3
- `beautifulsoup4` == 4.12.2
- `ipykernel` == 6.23.1
### Credentials
The following environment variables may be needed:
- `WANDB_API_KEY`: Weights & Biases API key for experiment logging. Required if `wandb.enabled=true` (default).
- `WANDB_CACHE_DIR`: Set automatically by code (train.py:L31) to local cache directory.
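Because a missing API key only surfaces once `wandb.init` is called, it can be worth checking the credential up front. A minimal sketch of such a guard (the helper name is illustrative, not part of the repo, and it ignores other login routes W&B supports, such as `~/.netrc`):

```python
import os
import warnings


def resolve_wandb_enabled(config_enabled: bool) -> bool:
    """Return whether W&B logging should proceed.

    Logging is only attempted when enabled in config, and a missing
    WANDB_API_KEY is surfaced early rather than failing inside wandb.init().
    """
    if not config_enabled:
        return False
    if not os.environ.get("WANDB_API_KEY"):
        warnings.warn(
            "wandb.enabled=true but WANDB_API_KEY is not set; "
            "run `wandb login` or disable with wandb.enabled=false"
        )
        return False
    return True
```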
## Quick Install
```bash
# Create virtual environment (recommended)
python3 -m venv env
source env/bin/activate

# Install all dependencies
pip install -r requirements.txt

# Or install individually
pip install hydra-core==1.3.2 datasets==2.12.0 wandb==0.15.3 beautifulsoup4==4.12.2 ipykernel==6.23.1
```
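After installing, the pins can be verified without importing any of the packages by querying installed distribution metadata. This check is a convenience sketch, not part of the repo (it needs Python 3.8+ for `importlib.metadata`, which matches the environment's requirement):

```python
import importlib.metadata


def check_pins(required: dict) -> dict:
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in required:
        try:
            found[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            found[name] = None
    return found


pins = {
    "hydra-core": "1.3.2",
    "datasets": "2.12.0",
    "wandb": "0.15.3",
    "beautifulsoup4": "4.12.2",
    "ipykernel": "6.23.1",
}
for name, version in check_pins(pins).items():
    status = version if version is not None else "MISSING"
    print(f"{name}: {status} (expected {pins[name]})")
```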
## Code Evidence
Hydra configuration entry point in `train.py:48-53`:
```python
@hydra.main(version_base=None, config_path="config", config_name="config")
def main(config: DictConfig):
    """Main entry point for training."""
    OmegaConf.resolve(config)
    missing_keys: Set[str] = OmegaConf.missing_keys(config)
    if missing_keys:
        raise ValueError(f"Got missing keys in config:\n{missing_keys}")
```
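The `missing_keys` check comes from OmegaConf: any key declared as `???` in the YAML is MISSING until supplied on the command line, at which point the snippet above raises. A minimal sketch of such a config (key names are illustrative except `exp_name`, which the Common Errors table below confirms is required, and the `wandb`/`local_dirs` fields shown elsewhere on this page):

```yaml
# config/config.yaml (illustrative sketch, not the repo's actual file)
exp_name: ???          # MISSING: must be passed as exp_name=... on the CLI
debug: false
wandb:
  enabled: true
  entity: null
  project: dpo-experiments
local_dirs: [/scr-ssd, /scr, .cache]
```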
Dataset loading via HuggingFace Datasets in `preference_datasets.py:141-142`:
```python
print(f'Loading HH dataset ({split} split) from Huggingface...')
dataset = datasets.load_dataset('Anthropic/hh-rlhf', split=split, cache_dir=cache_dir)
```
W&B initialization in `train.py:30-38`:
```python
if rank == 0 and config.wandb.enabled:
    os.environ['WANDB_CACHE_DIR'] = get_local_dir(config.local_dirs)
    wandb.init(
        entity=config.wandb.entity,
        project=config.wandb.project,
        config=OmegaConf.to_container(config),
        dir=get_local_dir(config.local_dirs),
        name=config.exp_name,
    )
```
BeautifulSoup HTML stripping for StackExchange data in `preference_datasets.py:23-43`:
```python
def strip_html_tags(html_string):
    """Strip HTML tags from a string, except for <code> tags."""
    soup = BeautifulSoup(html_string, 'html.parser')
    text = []
    for element in soup.children:
        if isinstance(element, NavigableString):
            continue
        if element.name == 'p':
            text.append(''.join(child.string for child in element.children if isinstance(child, NavigableString)))
        # ... (handling of the remaining tags and the final return are omitted
        # here; see preference_datasets.py:23-43 for the full function)
```
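The same idea, keeping the text inside paragraph tags while dropping other markup, can be sketched with only the standard library, which is handy for checking expected behavior without installing BeautifulSoup. The class and function names here are illustrative, and the tag handling is simplified relative to the repo's function (for instance, it keeps text nested inside inline tags like `<b>`):

```python
from html.parser import HTMLParser


class ParagraphTextExtractor(HTMLParser):
    """Collect text that appears inside <p> tags, ignoring other markup."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = 0
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph += 1
            self.pieces.append([])  # start a new paragraph buffer

    def handle_endtag(self, tag):
        if tag == "p" and self.in_paragraph:
            self.in_paragraph -= 1

    def handle_data(self, data):
        if self.in_paragraph:
            self.pieces[-1].append(data)


def strip_to_paragraph_text(html_string: str) -> str:
    parser = ParagraphTextExtractor()
    parser.feed(html_string)
    return "\n\n".join("".join(p) for p in parser.pieces)
```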
## Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `ValueError: Got missing keys in config` | Required Hydra config keys not provided | Pass all required keys via CLI overrides (e.g., `exp_name=my_experiment`) |
| `wandb.errors.UsageError: api_key not configured` | W&B API key not set | Run `wandb login` or set `WANDB_API_KEY`, or disable with `wandb.enabled=false` |
| `ConnectionError: Couldn't reach HuggingFace Hub` | No internet access for dataset download | Ensure network access or pre-download datasets to `cache_dir` |
## Compatibility Notes
- Debug Mode: Setting `debug=true` disables W&B logging and model checkpointing (train.py:L27-28).
- Local Dirs: The code tries directories in order from `config.local_dirs` (default: `/scr-ssd`, `/scr`, `.cache`) and uses the first one that exists.
- Dataset Caching: All datasets are cached locally after first download. Cache location is determined by `local_dirs` config.
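The first-existing-directory fallback described above can be sketched as follows. A `get_local_dir` helper does exist in the repo, but this standalone version is an assumption based on the description here, not a copy of it:

```python
import os


def first_existing_dir(candidates) -> str:
    """Return the first candidate directory that exists on this machine.

    Mirrors the described fallback order, e.g. /scr-ssd -> /scr -> .cache.
    If none exist, the last candidate is created and used (an assumption;
    the repo's helper may behave differently in that case).
    """
    for path in candidates:
        if os.path.isdir(path):
            return path
    fallback = candidates[-1]
    os.makedirs(fallback, exist_ok=True)
    return fallback
```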