Environment: Eric Mitchell Direct Preference Optimization HuggingFace Transformers

Knowledge Sources	Direct Preference Optimization HuggingFace Transformers
Domains	Infrastructure, NLP
Last Updated	2026-02-08 02:00 GMT

Overview

HuggingFace Transformers 4.29.2 environment with tokenizers 0.13.3 for model loading, tokenization, and text generation.

Description

This environment provides the HuggingFace ecosystem required for loading pre-trained causal language models, tokenizing preference data, and generating text samples during evaluation. It includes the `transformers` library for `AutoModelForCausalLM` and `AutoTokenizer`, and the `tokenizers` library for fast tokenization. Models are loaded with a configurable dtype (`float32`, `bfloat16`, or `float16`); when `config.trainer` is `BasicTrainer`, `device_map='balanced'` is also passed to spread the model across the available GPUs.

Usage

Use this environment for model loading (pre-trained causal LMs from the HuggingFace Hub or from local paths) and for tokenization (converting prompt/response text to token IDs with truncation and label masking). Every workflow that loads models or processes text data requires it.
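
The truncation and label-masking step can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's implementation: the helper name `build_example`, its parameters, and the truncation policy (keep the end of the prompt, then clip the response) are hypothetical. The value -100 is the default `ignore_index` of PyTorch's cross-entropy loss, so positions labeled -100 do not contribute to the loss.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_example(prompt_ids, response_ids, max_length, max_prompt_length):
    """Concatenate prompt and response token IDs, truncate, and mask prompt labels.

    Hypothetical helper for illustration only.
    """
    # If the pair is too long, keep only the last max_prompt_length prompt tokens.
    if len(prompt_ids) + len(response_ids) > max_length:
        prompt_ids = prompt_ids[-max_prompt_length:]
    # If it is still too long, clip the response to the remaining budget.
    if len(prompt_ids) + len(response_ids) > max_length:
        response_ids = response_ids[: max_length - len(prompt_ids)]

    input_ids = list(prompt_ids) + list(response_ids)
    # Only response tokens are supervised; prompt positions are ignored by the loss.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return {"input_ids": input_ids, "labels": labels}


example = build_example([11, 12, 13, 14], [21, 22, 23], max_length=5, max_prompt_length=3)
print(example)
```

Masking the prompt positions means the model is trained only on the response tokens while still conditioning on the prompt.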

System Requirements

Category	Requirement	Notes
OS	Linux, macOS, or Windows	Cross-platform Python package
Network	Internet access (first run)	Required to download models and datasets from HuggingFace Hub; cached locally after first download
Disk	10-50GB+	Depends on model size; GPT-2 XL ~6GB, Pythia 6.9B ~14GB, LLaMA 7B ~14GB
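
The per-model figures in the Disk row are roughly what parameter count times bytes per parameter gives. The sketch below is a back-of-the-envelope estimate only: the rounded parameter counts and the storage dtype assumed for each checkpoint are assumptions, and real downloads also include tokenizer and config files.

```python
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def checkpoint_size_gb(num_params, dtype):
    """Approximate size of the stored weights in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Parameter counts and storage dtypes are rounded, assumed values for illustration.
for name, num_params, dtype in [
    ("GPT-2 XL", 1.5e9, "float32"),
    ("Pythia 6.9B", 6.9e9, "float16"),
    ("LLaMA 7B", 6.7e9, "float16"),
]:
    print(f"{name}: ~{checkpoint_size_gb(num_params, dtype):.1f} GB")
```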

Dependencies

Python Packages

`transformers` == 4.29.2
`tokenizers` == 0.13.3
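
To confirm that an existing environment matches these pins, a small check with the standard library's `importlib.metadata` is one option. The helper name `find_mismatches` and its `get_version` parameter are hypothetical, introduced here only so the check can be exercised without the packages installed.

```python
from importlib import metadata

PINS = {"transformers": "4.29.2", "tokenizers": "0.13.3"}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: installed_version_or_None} for every pin that is not satisfied."""
    mismatches = {}
    for package, wanted in pins.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = installed
    return mismatches


if __name__ == "__main__":
    for package, installed in find_mismatches(PINS).items():
        print(f"{package}: expected {PINS[package]}, found {installed}")
```

An empty result means both pinned versions are installed.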

Credentials

The following environment variables may be needed:

`HF_TOKEN`: HuggingFace API token if accessing gated models (e.g., LLaMA). Not explicitly referenced in code but required by the `transformers` library for gated model downloads.
`XDG_CACHE_HOME`: Set automatically by code (train.py:L79) to control where models are cached locally.
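
For a standalone script that runs outside `train.py`, the two variables could be handled along the following lines. This is a sketch, not code from the repository: the helper name `prepare_hf_environment` and the cache path are assumptions, and the demo writes to a plain dictionary instead of the real process environment.

```python
import os


def prepare_hf_environment(cache_root, env=None):
    """Point XDG_CACHE_HOME at cache_root and report whether HF_TOKEN is set.

    Hypothetical helper; `env` defaults to the real process environment.
    """
    env = os.environ if env is None else env
    env["XDG_CACHE_HOME"] = str(cache_root)
    return bool(env.get("HF_TOKEN"))


env = {}  # stand-in for os.environ in this demo
has_token = prepare_hf_environment("/tmp/hf_cache", env)  # example path, an assumption
if not has_token:
    print("HF_TOKEN is not set; gated models such as LLaMA may fail to download.")
```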

Quick Install

pip install transformers==4.29.2 tokenizers==0.13.3

Code Evidence

Model loading with dtype configuration in `train.py:80-84`:

print('building policy')
model_kwargs = {'device_map': 'balanced'} if config.trainer == 'BasicTrainer' else {}
policy_dtype = getattr(torch, config.model.policy_dtype)
policy = transformers.AutoModelForCausalLM.from_pretrained(
    config.model.name_or_path, cache_dir=get_local_dir(config.local_dirs), low_cpu_mem_usage=True, torch_dtype=policy_dtype, **model_kwargs)

Tokenizer loading with pad token fallback in `trainers.py:158-162`:

tokenizer_name_or_path = config.model.tokenizer_name_or_path or config.model.name_or_path
rank0_print(f'Loading tokenizer {tokenizer_name_or_path}')
self.tokenizer = transformers.AutoTokenizer.from_pretrained(tokenizer_name_or_path, cache_dir=get_local_dir(config.local_dirs))
if self.tokenizer.pad_token_id is None:
    self.tokenizer.pad_token_id = self.tokenizer.eos_token_id

Text generation during evaluation in `trainers.py:188-189`:

policy_output = self.policy.generate(
    batch['prompt_input_ids'], attention_mask=batch['prompt_attention_mask'], max_length=self.config.max_length, do_sample=True, pad_token_id=self.tokenizer.pad_token_id)
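
In `generate`, `max_length` bounds the prompt plus the generated continuation, and a decoder-only model returns the prompt tokens followed by the new tokens. The sketch below illustrates both points with plain lists; the helper names are hypothetical, and for a padded batch `prompt_length` would be the padded prompt length.

```python
def new_token_budget(prompt_length, max_length):
    """Number of tokens generation may add when max_length caps prompt + continuation."""
    return max(0, max_length - prompt_length)


def split_continuation(output_ids, prompt_length):
    """Drop the prompt prefix that a decoder-only model returns ahead of its continuation."""
    return output_ids[prompt_length:]


prompt_length, max_length = 6, 10
output_ids = [101, 102, 103, 104, 105, 106, 7, 8, 9]  # prompt followed by 3 generated tokens
print(new_token_budget(prompt_length, max_length))
print(split_continuation(output_ids, prompt_length))
```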

Common Errors

Error Message	Cause	Solution
`OSError: Can't load tokenizer for 'model_name'`	Model not found or no internet access	Ensure model name is correct and network is available, or use a local path
`AssertionError: Prompt contains EOS token`	Input text contains the model's EOS token	Preprocess data to remove EOS tokens from prompts before tokenization
`ValueError: Could not find block class X`	Block class name mismatch when using FSDP with custom model	Check model architecture for correct transformer block class name (e.g., `GPT2Block`, `GPTNeoXLayer`)
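
For the EOS assertion, one possible preprocessing step is to strip the EOS string from each prompt before tokenization. The sketch below assumes GPT-2's EOS string `<|endoftext|>`; in practice the value would come from `tokenizer.eos_token`. The helper name `strip_eos` and the sample prompts are hypothetical.

```python
def strip_eos(prompt, eos_token):
    """Remove every occurrence of the EOS token string from a prompt."""
    return prompt.replace(eos_token, "")


EOS = "<|endoftext|>"  # GPT-2's EOS string; other tokenizers differ
raw_prompts = [
    "Human: hi<|endoftext|>\n\nAssistant:",
    "Human: hello\n\nAssistant:",
]
clean_prompts = [strip_eos(p, EOS) for p in raw_prompts]
print(clean_prompts)
```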

Compatibility Notes

Supported Models: Any HuggingFace `AutoModelForCausalLM`-compatible model. Pre-configured: GPT-2 Large, GPT-2 XL, GPT-J 6B, Pythia 2.8B, Pythia 6.9B, LLaMA 7B.
Custom Models: Use `model=blank_model` with `model.name_or_path=NAME_OR_PATH` and optionally `model.tokenizer_name_or_path` if tokenizer path differs.
Pad Token: If the tokenizer has no pad token, the code automatically sets `pad_token_id = eos_token_id` (trainers.py:L161-162).
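
Reusing the EOS ID as the pad ID works because padded positions are excluded through the attention mask. A minimal sketch of batch padding with plain lists follows; the helper name `pad_batch` is hypothetical, and right-padding is chosen here only for simplicity.

```python
def pad_batch(sequences, pad_token_id):
    """Right-pad token ID lists to a common length and build matching attention masks."""
    longest = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = longest - len(seq)
        input_ids.append(list(seq) + [pad_token_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)  # 0 marks padding
    return input_ids, attention_mask


EOS_TOKEN_ID = 50256  # GPT-2's EOS ID, reused as the pad ID in this demo
ids, mask = pad_batch([[5, 6, 7], [8]], pad_token_id=EOS_TOKEN_ID)
print(ids)
print(mask)
```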
