Implementation:LLMBook zh LLMBook zh github io AutoModelForCausalLM From Pretrained SFT

From Leeroopedia


Knowledge Sources
Domains: NLP, Deep_Learning
Last Updated: 2026-02-08 00:00 GMT

Overview

A concrete tool, provided by HuggingFace Transformers, for loading pre-trained models with FlashAttention-2 enabled for supervised fine-tuning (SFT).

Description

In the SFT context, AutoModelForCausalLM.from_pretrained loads a base model for full fine-tuning on instruction-response data. The call is identical to the one used when loading a model for pre-training; what differs is the downstream usage, which involves SFTDataset and DataCollatorForSupervisedDataset.

This is a wrapper doc: it documents how the LLMBook repository uses AutoModelForCausalLM specifically for supervised fine-tuning.

Usage

Use this to load the base model before passing it to Trainer with SFTDataset and DataCollatorForSupervisedDataset.
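The repository's SFTDataset and DataCollatorForSupervisedDataset are not reproduced on this page. As a rough illustration of what a supervised collator of this kind typically does, the sketch below pads a batch and masks the padded label positions with -100, the index that PyTorch's cross-entropy loss ignores by default. The function name collate_sft_batch, the example token IDs, and the pad_token_id value are assumptions made for illustration, not the repository's code.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def collate_sft_batch(examples, pad_token_id):
    """Pad a list of {"input_ids": [...], "labels": [...]} dicts to a common length.

    Hypothetical stand-in for a supervised data collator; plain lists are used
    instead of tensors so the sketch runs without extra dependencies.
    """
    max_len = max(len(ex["input_ids"]) for ex in examples)
    batch = {"input_ids": [], "labels": [], "attention_mask": []}
    for ex in examples:
        n_real = len(ex["input_ids"])
        n_pad = max_len - n_real
        # Inputs are padded with the tokenizer's pad token.
        batch["input_ids"].append(ex["input_ids"] + [pad_token_id] * n_pad)
        # Labels are padded with IGNORE_INDEX so padding adds nothing to the loss.
        batch["labels"].append(ex["labels"] + [IGNORE_INDEX] * n_pad)
        # The attention mask marks real tokens with 1 and padding with 0.
        batch["attention_mask"].append([1] * n_real + [0] * n_pad)
    return batch


# Made-up token IDs: prompt positions are already masked in the labels,
# so only response tokens would be trained on.
examples = [
    {"input_ids": [5, 6, 7, 8], "labels": [IGNORE_INDEX, IGNORE_INDEX, 7, 8]},
    {"input_ids": [5, 9], "labels": [IGNORE_INDEX, 9]},
]
batch = collate_sft_batch(examples, pad_token_id=0)
```

In a real pipeline these lists would be converted to tensors before being handed to Trainer together with the loaded model.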

Code Reference

Source Location

  • Repository: LLMBook-zh
  • File: code/7.1 SFT实践.py
  • Lines: 76

Signature

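# Simplified view of the call as used in the repository;
# from_pretrained itself accepts many more keyword arguments.
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path: str,
    attn_implementation: str = "flash_attention_2",  # value the repository passes, not the library default
) -> PreTrainedModel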

Import

from transformers import AutoModelForCausalLM

I/O Contract

Inputs

Name | Type | Required | Description
model_name_or_path | str | Yes | HuggingFace model ID or local path
attn_implementation | str | No | Attention backend; the repository passes "flash_attention_2"

Outputs

Name | Type | Description
return | PreTrainedModel | Model initialized with pre-trained weights for SFT

Usage Examples

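from transformers import AutoModelForCausalLM

# FlashAttention-2 requires the flash-attn package and a CUDA GPU,
# and it only runs in half precision (fp16 or bf16).
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    attn_implementation="flash_attention_2",
)
# The model can now be passed to Trainer for supervised fine-tuning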
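On machines where the flash-attn package is not installed, requesting "flash_attention_2" fails at load time. A minimal sketch of one way to guard against that is shown below; it falls back to "sdpa", another attention backend accepted by Transformers. The helper names choose_attn_implementation and load_sft_base_model, and the choice of bfloat16, are assumptions for illustration and are not taken from the LLMBook repository.

```python
import importlib.util


def choose_attn_implementation(preferred="flash_attention_2", fallback="sdpa"):
    """Return `preferred`, or `fallback` if FlashAttention-2 was requested
    but the flash_attn package cannot be found (hypothetical helper)."""
    if preferred == "flash_attention_2" and importlib.util.find_spec("flash_attn") is None:
        return fallback
    return preferred


def load_sft_base_model(model_name_or_path):
    """Load a causal LM for SFT with the best available attention backend."""
    # Imported lazily so the helper above can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_name_or_path,
        # Assumption: bf16 weights, since FlashAttention-2 only supports fp16/bf16.
        # Recent Transformers releases also accept this argument under the name `dtype`.
        torch_dtype=torch.bfloat16,
        attn_implementation=choose_attn_implementation(),
    )
```

Calling load_sft_base_model("meta-llama/Llama-2-7b-hf") would then behave like the example above when flash-attn is available, and use the "sdpa" backend otherwise.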

Related Pages

Implements Principle

Requires Environment
