Environment:LLMBook_zh_LLMBook_zh_github_io_HuggingFace_Transformers_Stack
| Knowledge Sources | Details |
|---|---|
| Domains | Infrastructure, NLP, LLMs |
| Last Updated | 2026-02-08 04:30 GMT |
Overview
Hugging Face ecosystem environment including transformers, peft, trl, and datasets libraries for LLM training, fine-tuning, and alignment.
Description
This environment provides the Hugging Face software stack used across all training and fine-tuning workflows. The `transformers` library supplies `AutoModelForCausalLM`, `AutoTokenizer`, `Trainer`, `TrainingArguments`, and `HfArgumentParser`, which are used in the pre-training, SFT, LoRA, and DPO scripts. The `peft` library provides `LoraConfig` and `get_peft_model` for parameter-efficient fine-tuning. The `trl` library provides `DPOTrainer` for preference alignment. The `datasets` library handles data loading via `load_dataset`. FlashAttention 2 is enabled via `attn_implementation="flash_attention_2"`.
Usage
Use this environment for all model loading, training, fine-tuning, and alignment workflows. It is required by every script that loads models with `AutoModelForCausalLM.from_pretrained()` or trains with `Trainer`/`DPOTrainer`.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux (Ubuntu recommended) | Full support for all HF libraries |
| Hardware | NVIDIA GPU | Required for FlashAttention 2 (Ampere+ architecture) |
| Python | Python >= 3.8 | Required by transformers |
| Disk | 30GB+ | For cached model weights from Hugging Face Hub |
Dependencies
Python Packages
- `transformers` >= 4.30
- `peft` >= 0.4.0
- `trl` >= 0.5.0
- `datasets` >= 2.14
- `accelerate` >= 0.20
- `flash-attn` >= 2.0 (for FlashAttention 2 support)
- `deepspeed` >= 0.9 (optional, for distributed training)
Credentials
The following environment variables may be needed:
- `HF_TOKEN`: Hugging Face API token for accessing gated models (e.g., LLaMA-2 requires access approval).
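One way to use the token, sketched below, is to read `HF_TOKEN` from the environment and forward it explicitly to `from_pretrained`, rather than relying on a cached `huggingface-cli login`. The gated-model call is shown commented out because it additionally requires license acceptance on the Hub.

```python
# Sketch: forward HF_TOKEN explicitly when loading a gated model.
import os

token = os.environ.get("HF_TOKEN")  # None if the variable is unset

# With transformers installed and Hub access approved, the token is
# forwarded like this:
# model = AutoModelForCausalLM.from_pretrained(
#     "meta-llama/Llama-2-7b-hf", token=token
# )
```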
Quick Install
# Install the full Hugging Face stack
pip install transformers peft trl datasets accelerate
# For FlashAttention 2 support (requires CUDA)
pip install flash-attn --no-build-isolation
# Optional: DeepSpeed for distributed training
pip install deepspeed
Code Evidence
Transformers imports from `code/6.2 预训练实践.py:3-9`:
from transformers import (
AutoModelForCausalLM,
AutoTokenizer,
HfArgumentParser,
TrainingArguments,
Trainer,
)
FlashAttention 2 usage from `code/6.2 预训练实践.py:55`:
model = AutoModelForCausalLM.from_pretrained(
args.model_name_or_path, attn_implementation="flash_attention_2"
)
PEFT imports from `code/7.4 LoRA实践.py:3-8`:
from peft import (
LoraConfig,
TaskType,
AutoPeftModelForCausalLM,
get_peft_model,
)
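Those imports are typically combined as follows. This is a configuration sketch only: the rank, alpha, dropout, and `target_modules` values here are illustrative defaults, not the settings used in the book's LoRA script.

```python
# Sketch of wiring LoraConfig into a causal-LM model.
# All hyperparameter values below are illustrative, not the book's settings.
from peft import LoraConfig, TaskType, get_peft_model

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                      # LoRA rank
    lora_alpha=32,            # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # typical attention projections
)

# model = get_peft_model(model, lora_config)
# model.print_trainable_parameters()  # confirms only adapter weights train
```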
DeepSpeed integration from `code/7.4 LoRA实践.py:9-12`:
from transformers.integrations.deepspeed import (
is_deepspeed_zero3_enabled,
unset_hf_deepspeed_config,
)
TRL DPOTrainer from `code/8.2 DPO实践.py:5`:
from trl import DPOTrainer
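`DPOTrainer` consumes preference pairs, i.e. rows with `prompt`, `chosen`, and `rejected` fields. The sketch below shows that record shape plus the trainer wiring; the wiring is commented out because the exact keyword set (`beta`, `tokenizer` vs. newer config objects) varies across `trl` versions, and the values shown are illustrative.

```python
# Sketch: the preference-pair record format DPOTrainer consumes, plus the
# trainer wiring (commented out; keyword names vary by trl version).
preference_example = {
    "prompt": "What is LoRA?",
    "chosen": "LoRA adds trainable low-rank adapters to frozen weights.",
    "rejected": "LoRA is a kind of radio protocol.",
}

# from trl import DPOTrainer
# trainer = DPOTrainer(
#     model=model,                  # policy to optimize
#     ref_model=ref_model,          # frozen reference for the KL term
#     beta=0.1,                     # preference-loss strength (illustrative)
#     train_dataset=train_dataset,  # rows shaped like preference_example
#     tokenizer=tokenizer,
# )
# trainer.train()
```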
Datasets library from `code/6.3 预训练数据类.py:2`:
from datasets import load_dataset
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `OSError: meta-llama/Llama-2-7b-hf is gated` | Model requires access approval | Accept the license on the Hugging Face Hub and set `HF_TOKEN` |
| `ImportError: flash_attn not found` | FlashAttention not installed | `pip install flash-attn --no-build-isolation` |
| `ImportError: peft not found` | PEFT library not installed | `pip install peft` |
| `ImportError: trl not found` | TRL library not installed | `pip install trl` |
Compatibility Notes
- FlashAttention 2: supported only on Ampere (A100/A10) and newer NVIDIA GPUs. On older hardware, do not rely on an automatic fallback; omit `attn_implementation="flash_attention_2"` (or pass `"sdpa"`) so the default attention implementation is used.
- DeepSpeed Zero-3: when merging LoRA adapters after Zero-3 training, call `unset_hf_deepspeed_config()` first (see `code/7.4 LoRA实践.py:47`).
- GPTQConfig: the GPTQ quantization workflow additionally requires the `auto-gptq` package.
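The Zero-3 merge constraint noted above can be sketched as follows. The checkpoint paths are placeholders, and the load/merge calls are commented out since they need a real adapter checkpoint on disk; the ordering, clearing the DeepSpeed config before reloading, is the point.

```python
# Sketch of the post-Zero-3 LoRA merge flow; paths are placeholders.
# The stale DeepSpeed config must be cleared before the adapter checkpoint
# can be reloaded outside the Zero-3 partitioned context.
from transformers.integrations.deepspeed import (
    is_deepspeed_zero3_enabled,
    unset_hf_deepspeed_config,
)

if is_deepspeed_zero3_enabled():
    unset_hf_deepspeed_config()

# from peft import AutoPeftModelForCausalLM
# model = AutoPeftModelForCausalLM.from_pretrained("output/lora-adapter")  # placeholder
# merged = model.merge_and_unload()  # fold LoRA weights into the base model
# merged.save_pretrained("output/merged")
```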
Related Pages
- Implementation:LLMBook_zh_LLMBook_zh_github_io_AutoModelForCausalLM_From_Pretrained_Pretraining
- Implementation:LLMBook_zh_LLMBook_zh_github_io_Trainer_Train_Pretraining
- Implementation:LLMBook_zh_LLMBook_zh_github_io_Trainer_Save_Model_Pretraining
- Implementation:LLMBook_zh_LLMBook_zh_github_io_PTDataset
- Implementation:LLMBook_zh_LLMBook_zh_github_io_AutoModelForCausalLM_From_Pretrained_SFT
- Implementation:LLMBook_zh_LLMBook_zh_github_io_Trainer_Train_SFT
- Implementation:LLMBook_zh_LLMBook_zh_github_io_DataCollatorForSupervisedDataset
- Implementation:LLMBook_zh_LLMBook_zh_github_io_LoraConfig_Get_Peft_Model
- Implementation:LLMBook_zh_LLMBook_zh_github_io_Trainer_Train_LoRA
- Implementation:LLMBook_zh_LLMBook_zh_github_io_AutoPeftModelForCausalLM_Merge_And_Unload
- Implementation:LLMBook_zh_LLMBook_zh_github_io_LlamaRewardModel
- Implementation:LLMBook_zh_LLMBook_zh_github_io_Get_Data_DPO
- Implementation:LLMBook_zh_LLMBook_zh_github_io_AutoModelForCausalLM_From_Pretrained_DPO
- Implementation:LLMBook_zh_LLMBook_zh_github_io_DPOTrainer_Train
- Implementation:LLMBook_zh_LLMBook_zh_github_io_AutoModelForCausalLM_From_Pretrained_Bitsandbytes
- Implementation:LLMBook_zh_LLMBook_zh_github_io_GPTQConfig_Quantization