Environment:Eric mitchell Direct preference optimization HuggingFace Transformers

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, NLP
Last Updated: 2026-02-08 02:00 GMT

Overview

HuggingFace Transformers 4.29.2 environment with tokenizers 0.13.3 for model loading, tokenization, and text generation.

Description

This environment provides the HuggingFace ecosystem required for loading pre-trained causal language models, tokenizing preference data, and generating text samples during evaluation. It includes the `transformers` library for `AutoModelForCausalLM` and `AutoTokenizer`, and the `tokenizers` library for fast tokenization. Models are loaded with a configurable dtype (`float32`, `bfloat16`, or `float16`); when the trainer is `BasicTrainer`, `device_map='balanced'` is also passed so the model is distributed across the available GPUs.

Usage

Use this environment to load pre-trained causal LMs from the HuggingFace Hub or local paths, and to tokenize prompt/response text into token IDs with truncation and label masking. Every workflow that loads models or processes text data requires it.
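
The truncation and label masking step can be sketched without any library dependency. The helper below is illustrative only and is not taken from the repository: the name `build_labels`, the truncation policy (keep the end of the prompt, then the start of the response), and the `-100` mask value (the default `ignore_index` of PyTorch's cross-entropy loss) are all assumptions.

```python
from typing import Dict, List

IGNORE_INDEX = -100  # assumption: PyTorch's default ignore_index for cross-entropy


def build_labels(prompt_ids: List[int], response_ids: List[int],
                 max_length: int, max_prompt_length: int) -> Dict[str, List[int]]:
    """Concatenate prompt and response IDs, truncate, and mask prompt labels."""
    if not 0 < max_prompt_length <= max_length:
        raise ValueError("need 0 < max_prompt_length <= max_length")
    if len(prompt_ids) + len(response_ids) > max_length:
        # Hypothetical policy: first keep only the most recent prompt tokens ...
        prompt_ids = prompt_ids[-max_prompt_length:]
    if len(prompt_ids) + len(response_ids) > max_length:
        # ... then cut the response so the total fits in max_length.
        response_ids = response_ids[: max_length - len(prompt_ids)]

    input_ids = list(prompt_ids) + list(response_ids)
    # Prompt positions are masked so a loss would cover response tokens only.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return {"input_ids": input_ids, "labels": labels,
            "attention_mask": [1] * len(input_ids)}


example = build_labels([1, 2, 3, 4], [10, 11, 12], max_length=5, max_prompt_length=2)
```

Here `example["labels"]` starts with two masked positions, one per surviving prompt token, followed by the three response IDs.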

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, or Windows | Cross-platform Python package |
| Network | Internet access (first run) | Required to download models and datasets from HuggingFace Hub; cached locally after first download |
| Disk | 10-50GB+ | Depends on model size; GPT-2 XL ~6GB, Pythia 6.9B ~14GB, LLaMA 7B ~14GB |

Dependencies

Python Packages

  • `transformers` == 4.29.2
  • `tokenizers` == 0.13.3

Credentials

The following environment variables may be needed:

  • `HF_TOKEN`: HuggingFace API token, needed only when accessing gated models (e.g., LLaMA). The repository code never references it directly; the HuggingFace libraries read it when downloading gated models.
  • `XDG_CACHE_HOME`: Set automatically by code (train.py:L79) to control where models are cached locally.
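
A pre-flight check of these two variables might look like the sketch below. It uses only the standard library; the function names are hypothetical, and the fallback to `HUGGING_FACE_HUB_TOKEN` (the variable name read by older `huggingface_hub` releases) and to `~/.cache` (the XDG default when `XDG_CACHE_HOME` is unset) are assumptions about the surrounding tooling rather than something this repository's code does.

```python
import os
from typing import Mapping, Optional


def hub_token_from_env(env: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty Hub token found in env, or None."""
    # Assumption: the second name is the older spelling of the same setting.
    for name in ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def cache_root_from_env(env: Mapping[str, str]) -> str:
    """Resolve the base cache directory following the XDG convention."""
    return env.get("XDG_CACHE_HOME") or os.path.join(os.path.expanduser("~"), ".cache")


if __name__ == "__main__":
    if hub_token_from_env(os.environ) is None:
        print("No Hub token in the environment; gated models may fail to "
              "download unless you are logged in some other way.")
    print("Cache root:", cache_root_from_env(os.environ))
```

Taking the environment as a parameter keeps both helpers easy to test with a plain dictionary instead of the real `os.environ`.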

Quick Install

pip install transformers==4.29.2 tokenizers==0.13.3
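
After installing, the pins can be verified from Python. The following is a minimal sketch using the standard-library `importlib.metadata`; the helper name `find_mismatches` and the injectable `lookup` parameter are illustrative choices, not part of the repository.

```python
from importlib import metadata
from typing import Callable, Dict

# Versions pinned in the Dependencies section above.
PINNED = {"transformers": "4.29.2", "tokenizers": "0.13.3"}


def find_mismatches(pinned: Dict[str, str],
                    lookup: Callable[[str], str] = metadata.version) -> Dict[str, str]:
    """Map each package whose installed version differs to a short description."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            problems[name] = f"not installed (want {wanted})"
            continue
        if found != wanted:
            problems[name] = f"found {found} (want {wanted})"
    return problems


if __name__ == "__main__":
    # Prints nothing when both pins are satisfied.
    for package, problem in find_mismatches(PINNED).items():
        print(f"{package}: {problem}")
```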

Code Evidence

Model loading with dtype configuration in `train.py:80-84`:

print('building policy')
model_kwargs = {'device_map': 'balanced'} if config.trainer == 'BasicTrainer' else {}
policy_dtype = getattr(torch, config.model.policy_dtype)
policy = transformers.AutoModelForCausalLM.from_pretrained(
    config.model.name_or_path, cache_dir=get_local_dir(config.local_dirs), low_cpu_mem_usage=True, torch_dtype=policy_dtype, **model_kwargs)
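
The two decisions in this snippet (which dtype name to resolve, and whether to pass `device_map`) can be validated before any weights are loaded. The sketch below mirrors that logic without importing `torch`; `plan_model_kwargs` and `ALLOWED_DTYPES` are hypothetical names, and the allowed set is simply the three dtypes listed in the Description.

```python
from typing import Any, Dict

# Dtype names listed in the Description; extend this if your config allows others.
ALLOWED_DTYPES = ("float32", "bfloat16", "float16")


def plan_model_kwargs(trainer: str, policy_dtype: str) -> Dict[str, Any]:
    """Return the keyword arguments the loading snippet would pass on."""
    if policy_dtype not in ALLOWED_DTYPES:
        raise ValueError(f"unsupported policy_dtype {policy_dtype!r}; "
                         f"expected one of {ALLOWED_DTYPES}")
    # The dtype stays a string here; the real call passes getattr(torch, policy_dtype).
    kwargs: Dict[str, Any] = {"low_cpu_mem_usage": True, "torch_dtype": policy_dtype}
    if trainer == "BasicTrainer":
        # Mirrors the snippet: device_map is set only for BasicTrainer.
        kwargs["device_map"] = "balanced"
    return kwargs
```

Failing early on a mistyped dtype name avoids discovering the problem as an `AttributeError` from `getattr(torch, ...)` partway through startup.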

Tokenizer loading with pad token fallback in `trainers.py:158-162`:

tokenizer_name_or_path = config.model.tokenizer_name_or_path or config.model.name_or_path
rank0_print(f'Loading tokenizer {tokenizer_name_or_path}')
self.tokenizer = transformers.AutoTokenizer.from_pretrained(tokenizer_name_or_path, cache_dir=get_local_dir(config.local_dirs))
if self.tokenizer.pad_token_id is None:
    self.tokenizer.pad_token_id = self.tokenizer.eos_token_id
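
Once the pad ID equals the EOS ID, the attention mask is what tells padding apart from a genuine end-of-sequence token. The sketch below shows this with plain lists; `pad_batch` is a hypothetical helper, and the left-padding default (the usual choice when batching prompts for decoder-only generation) is an assumption rather than a description of the repository's collate function.

```python
from typing import Dict, List


def pad_batch(sequences: List[List[int]], pad_token_id: int,
              left: bool = True) -> Dict[str, List[List[int]]]:
    """Pad a non-empty list of token-ID lists and build matching attention masks."""
    width = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        padding = [pad_token_id] * (width - len(seq))
        mask_pad = [0] * len(padding)  # padded positions are masked out
        if left:
            input_ids.append(padding + list(seq))
            attention_mask.append(mask_pad + [1] * len(seq))
        else:
            input_ids.append(list(seq) + padding)
            attention_mask.append([1] * len(seq) + mask_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


EOS_ID = 50256  # GPT-2's end-of-text ID, reused here as the pad ID
batch = pad_batch([[11, 12, 13], [21]], pad_token_id=EOS_ID)
```

In `batch`, the second row begins with two copies of `EOS_ID` whose mask entries are `0`, so a model that honors the mask ignores them.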

Text generation during evaluation in `trainers.py:188-189`:

policy_output = self.policy.generate(
    batch['prompt_input_ids'], attention_mask=batch['prompt_attention_mask'], max_length=self.config.max_length, do_sample=True, pad_token_id=self.tokenizer.pad_token_id)

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `OSError: Can't load tokenizer for 'model_name'` | Model not found or no internet access | Ensure model name is correct and network is available, or use a local path |
| `AssertionError: Prompt contains EOS token` | Input text contains the model's EOS token | Preprocess data to remove EOS tokens from prompts before tokenization |
| `ValueError: Could not find block class X` | Block class name mismatch when using FSDP with custom model | Check model architecture for correct transformer block class name (e.g., `GPT2Block`, `GPTNeoXLayer`) |
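
For the EOS assertion, one possible preprocessing step is to strip the EOS string from prompts before tokenization. The sketch below works at the string level, which is an assumption: if the assertion is checked on token IDs, a string-level filter only helps when the EOS token has a literal text form, such as GPT-2's `<|endoftext|>`. Both function names are hypothetical.

```python
from typing import List


def strip_eos(prompt: str, eos_token: str) -> str:
    """Remove every occurrence of the EOS string from a prompt."""
    if not eos_token:
        return prompt
    return prompt.replace(eos_token, "")


def find_offending_prompts(prompts: List[str], eos_token: str) -> List[int]:
    """Return the indices of prompts that still contain the EOS string."""
    if not eos_token:
        return []
    return [i for i, prompt in enumerate(prompts) if eos_token in prompt]


GPT2_EOS = "<|endoftext|>"
raw_prompts = ["Summarize this post.", "Hello<|endoftext|> world"]
clean_prompts = [strip_eos(p, GPT2_EOS) for p in raw_prompts]
```

Running `find_offending_prompts` on the raw data first shows how many examples are affected, which helps decide between stripping the token and dropping those examples.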

Compatibility Notes

  • Supported Models: Any HuggingFace `AutoModelForCausalLM`-compatible model. Pre-configured: GPT-2 Large, GPT-2 XL, GPT-J 6B, Pythia 2.8B, Pythia 6.9B, LLaMA 7B.
  • Custom Models: Use `model=blank_model` with `model.name_or_path=NAME_OR_PATH` and optionally `model.tokenizer_name_or_path` if tokenizer path differs.
  • Pad Token: If the tokenizer has no pad token, the code automatically sets `pad_token_id = eos_token_id` (trainers.py:L161-162).
