Implementation:Huggingface Open r1 Trainer Save and Push

Knowledge Sources	Open R1 Transformers Trainer
Domains	NLP, Infrastructure
Last Updated	2026-02-08 00:00 GMT

Overview

Wrapper for HuggingFace Trainer's save_model and push_to_hub methods with Open-R1-specific model card creation and KV cache restoration.

Description

This page documents a wrapper around existing Trainer methods rather than a standalone API. The saving logic is identical in sft.py and grpo.py and runs in four steps:

1. Set the model's generation config EOS token ID to the tokenizer's EOS token ID, so that generation stops at the end-of-sequence token instead of running unbounded.
2. Call trainer.save_model(output_dir) to persist model weights and tokenizer configuration to disk.
3. On the main process only: create a model card with dataset_name and the "open-r1" tag, restore use_cache = True for inference efficiency, and save the updated config to the output directory.
4. Optionally push the saved model to HuggingFace Hub with the same kwargs (dataset name and tags).
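
To illustrate why the EOS alignment in the first step matters, here is a minimal sketch of a decoding loop that stops on a configured EOS id. The function `toy_generate`, the token ids, and the `max_new_tokens` budget are all hypothetical; this is not how transformers implements generation, only a model of the stopping condition.

```python
def toy_generate(token_stream, eos_token_id, max_new_tokens=8):
    """Collect tokens until eos_token_id is seen or the budget is exhausted."""
    generated = []
    for token in token_stream:
        if len(generated) >= max_new_tokens:
            break
        generated.append(token)
        if token == eos_token_id:
            break
    return generated


# Hypothetical stream: the model emits the tokenizer's EOS (id 2) after three tokens.
stream = [11, 12, 13, 2, 14, 15, 16, 17, 18, 19]

# Stale generation config: it still points at an id the model never emits.
stale = toy_generate(stream, eos_token_id=0)

# Aligned generation config: the id was copied from the tokenizer.
aligned = toy_generate(stream, eos_token_id=2)

print(len(stale), len(aligned))  # 8 4
```

With the stale id the loop only ends when the token budget runs out, which is the "unbounded generation" symptom the alignment step is meant to avoid.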

The implementation also supports evaluation after saving when do_eval is True.

Usage

Invoked at the end of the main() function in both sft.py and grpo.py, after the training loop has completed. It is the final stage of every training run; the optional evaluation (when do_eval is True) runs after the model has been saved, not before it.

Code Reference

Property	Value
Repository	open-r1
File (SFT)	`src/open_r1/sft.py`
Lines (SFT)	L127-163
File (GRPO)	`src/open_r1/grpo.py`
Lines (GRPO)	L139-175
Import	Part of training script; uses `from trl import SFTTrainer` or `from trl import GRPOTrainer`

Signature

# Model saving sequence (from sft.py):
trainer.model.generation_config.eos_token_id = tokenizer.eos_token_id
trainer.save_model(training_args.output_dir)

kwargs = {"dataset_name": script_args.dataset_name, "tags": ["open-r1"]}
if trainer.accelerator.is_main_process:
    trainer.create_model_card(**kwargs)
    trainer.model.config.use_cache = True
    trainer.model.config.save_pretrained(training_args.output_dir)

if training_args.push_to_hub:
    trainer.push_to_hub(**kwargs)
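
The sequence above can be exercised without a real model by wrapping it in a function and passing mock objects. The sketch below is an illustration only: `save_and_publish` and `fake_trainer` are hypothetical names that do not exist in open-r1, and the mocks stand in for a real trainer and tokenizer. It shows that `save_model` runs on every process while the model card and config write are gated on the main process.

```python
from unittest.mock import MagicMock


def save_and_publish(trainer, tokenizer, output_dir, dataset_name, push_to_hub=False):
    """Hypothetical helper that mirrors the save sequence shown above."""
    trainer.model.generation_config.eos_token_id = tokenizer.eos_token_id
    trainer.save_model(output_dir)  # called on every process

    kwargs = {"dataset_name": dataset_name, "tags": ["open-r1"]}
    if trainer.accelerator.is_main_process:
        trainer.create_model_card(**kwargs)
        trainer.model.config.use_cache = True
        trainer.model.config.save_pretrained(output_dir)

    if push_to_hub:
        trainer.push_to_hub(**kwargs)
    return kwargs


def fake_trainer(is_main_process):
    """Build a mock trainer; use_cache starts as False, as it might after training."""
    trainer = MagicMock()
    trainer.accelerator.is_main_process = is_main_process
    trainer.model.config.use_cache = False
    return trainer


tokenizer = MagicMock(eos_token_id=2)
main, worker = fake_trainer(True), fake_trainer(False)
for t in (main, worker):
    save_and_publish(t, tokenizer, "out", "my-dataset")

print(main.create_model_card.called, worker.create_model_card.called)  # True False
print(main.model.config.use_cache, worker.model.config.use_cache)      # True False
```

Keeping the rank check inside the helper means callers in a multi-process launch can invoke it unconditionally on every rank.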

I/O Contract

Inputs

Parameter	Type	Required	Description
`trainer`	`SFTTrainer` or `GRPOTrainer`	Yes	Trained trainer instance from the completed training loop; provides `save_model`, `create_model_card`, `push_to_hub`, and access to `accelerator.is_main_process`
`training_args`	`SFTConfig` or `GRPOConfig`	Yes	Training configuration containing `output_dir` (directory path where model files are saved) and `push_to_hub` (boolean flag controlling whether to push to HuggingFace Hub)
`script_args`	`ScriptArguments`	Yes	Script-level arguments containing `dataset_name` (name of the training dataset, used in the model card)
`tokenizer`	`PreTrainedTokenizer`	Yes	Tokenizer used during training; provides `eos_token_id` for generation config alignment
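
Because the same sequence serves both scripts, the contract on `trainer` is effectively duck-typed: any object exposing the three methods listed above will do. A minimal sketch of that idea using `typing.Protocol` follows; `SavableTrainer` and `StubTrainer` are hypothetical names introduced for illustration and are not part of open-r1, TRL, or transformers.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SavableTrainer(Protocol):
    """Hypothetical structural type: the methods the save sequence relies on."""

    def save_model(self, output_dir): ...
    def create_model_card(self, **kwargs): ...
    def push_to_hub(self, **kwargs): ...


class StubTrainer:
    """Stand-in that satisfies the protocol without loading a model."""

    def save_model(self, output_dir):
        pass

    def create_model_card(self, **kwargs):
        pass

    def push_to_hub(self, **kwargs):
        pass


print(isinstance(StubTrainer(), SavableTrainer))  # True
print(isinstance(object(), SavableTrainer))       # False
```

The runtime check only confirms that the methods are present; it does not validate their signatures or the `accelerator` attribute.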

Outputs

Output	Description
Model files on disk	Model weights, tokenizer files, and configuration saved to `training_args.output_dir`
Model card	Auto-generated model card with dataset name and `"open-r1"` tag, written to the output directory
Updated config	Model config with `use_cache = True` restored, saved to the output directory
Hub upload	(Optional) Model, config, and model card pushed to HuggingFace Hub with tags when `push_to_hub` is enabled
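
A quick way to confirm the on-disk outputs after a run is to inspect the output directory. The sketch below assumes the config is written as `config.json` and the model card as `README.md`, which are the conventional file names but are treated here as assumptions; `check_output_dir` is a hypothetical helper, and the temporary directory only simulates a finished run.

```python
import json
import tempfile
from pathlib import Path


def check_output_dir(output_dir):
    """Report whether the expected config and model card files are present."""
    out = Path(output_dir)
    config_path = out / "config.json"  # assumed file name
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    return {
        "has_config": config_path.exists(),
        "use_cache_restored": config.get("use_cache") is True,
        "has_model_card": (out / "README.md").exists(),  # assumed file name
    }


# Simulate an output directory instead of running a real training job.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "config.json").write_text(json.dumps({"use_cache": True}))
    Path(tmp, "README.md").write_text("---\ntags:\n- open-r1\n---\n")
    report = check_output_dir(tmp)

print(report)
```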

Usage Examples

Save Sequence from sft.py

# At the end of main() in sft.py, after training completes:

# 1. Align EOS token for generation
trainer.model.generation_config.eos_token_id = tokenizer.eos_token_id

# 2. Save model to disk
trainer.save_model(training_args.output_dir)

# 3. Create model card and restore KV cache (main process only)
kwargs = {"dataset_name": script_args.dataset_name, "tags": ["open-r1"]}
if trainer.accelerator.is_main_process:
    trainer.create_model_card(**kwargs)
    trainer.model.config.use_cache = True
    trainer.model.config.save_pretrained(training_args.output_dir)

# 4. Optionally push to Hub
if training_args.push_to_hub:
    trainer.push_to_hub(**kwargs)

Related Pages

Principle:Huggingface_Open_r1_Model_Saving_and_Publishing
