Implementation: AutoModelForCausalLM.from_pretrained (Pre-training) — LLMBook-zh
| Knowledge Sources | |
|---|---|
| Domains | NLP, Deep_Learning |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
A HuggingFace Transformers entry point for loading pre-trained causal language models; this repository uses it to enable FlashAttention-2 during pre-training.
Description
AutoModelForCausalLM.from_pretrained automatically loads the correct model architecture and pre-trained weights based on the model name or path. In the pre-training context of this repository, it is used to load LLaMA-2 models with FlashAttention-2 enabled for efficient training.
This is a Wrapper Doc — it documents how the LLMBook repository uses the HuggingFace Transformers external API.
Usage
Use this when loading a base model for continued pre-training. Pass the model to HuggingFace Trainer along with a PTDataset.
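The training wiring can be exercised offline with a tiny randomly-initialized LLaMA-style model in place of the real checkpoint. This is a sketch only: the config values below are illustrative, not taken from the repository, and real continued pre-training would load weights via from_pretrained as documented here.

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

# Illustrative tiny config (not the repository's settings): a 2-layer
# LLaMA-style model small enough to run on CPU without downloading weights.
config = AutoConfig.for_model(
    "llama",
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=128,
    vocab_size=1000,
)
model = AutoModelForCausalLM.from_config(config)

# Causal-LM pre-training objective: passing labels=input_ids makes the model
# compute next-token cross-entropy (the shift happens inside the model).
input_ids = torch.randint(0, config.vocab_size, (2, 16))
out = model(input_ids=input_ids, labels=input_ids)
```

In the repository's actual setup, `model` would instead come from `from_pretrained` and be handed to `Trainer` together with the `PTDataset`.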
Code Reference
Source Location
- Repository: LLMBook-zh
- File: code/6.2 预训练实践.py
- Lines: 55
Signature
```
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path: str,
    attn_implementation: str = "flash_attention_2"
) -> PreTrainedModel
```
Import
```python
from transformers import AutoModelForCausalLM
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model_name_or_path | str | Yes | HuggingFace model ID or local path (e.g., "meta-llama/Llama-2-7b-hf") |
| attn_implementation | str | No | Attention backend; the repository passes "flash_attention_2" (the library itself defaults to SDPA/eager, not FlashAttention-2) |
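The "flash_attention_2" backend only works when the separate flash-attn package is installed; a defensive loader can select the backend at runtime. A minimal sketch, where `pick_attn_implementation` is a hypothetical helper and not part of the Transformers API:

```python
import importlib.util

def pick_attn_implementation(preferred: str = "flash_attention_2") -> str:
    """Hypothetical helper: fall back to "sdpa" when flash-attn is absent."""
    if preferred == "flash_attention_2" and importlib.util.find_spec("flash_attn") is None:
        # flash-attn is not importable, so from_pretrained would raise;
        # PyTorch's scaled-dot-product attention is a safe fallback.
        return "sdpa"
    return preferred

backend = pick_attn_implementation()
```

The returned string can then be passed as `attn_implementation=backend` to `from_pretrained`.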
Outputs
| Name | Type | Description |
|---|---|---|
| return | PreTrainedModel | Initialized model with pre-trained weights |
Usage Examples
```python
import torch
from transformers import AutoModelForCausalLM

# Load LLaMA-2 with FlashAttention-2 for pre-training.
# FlashAttention-2 requires the flash-attn package and fp16/bf16 weights,
# hence the explicit torch_dtype (the default would load in fp32).
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)
```