Implementation:Huggingface Open r1 Trainer Save and Push
| Knowledge Sources | |
|---|---|
| Domains | NLP, Infrastructure |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Wrapper for HuggingFace Trainer's save_model and push_to_hub methods with Open-R1-specific model card creation and KV cache restoration.
Description
This is a Wrapper Doc. The saving logic is identical in both sft.py and grpo.py:
- Align the model generation config EOS token with the tokenizer's EOS token to prevent unbounded generation.
- Call trainer.save_model(output_dir) to persist model weights and tokenizer configuration to disk.
- On the main process only: create a model card with dataset_name and the "open-r1" tag, restore use_cache = True for inference efficiency, and save the updated config to the output directory.
- Optionally push the saved model to HuggingFace Hub with the same kwargs (dataset name and tags).
The implementation also supports evaluation after saving when do_eval is True.
Usage
Invoked at the end of the main() function in both sft.py and grpo.py, once the training loop has completed. Saving is the closing stage of every training run; when do_eval is True, evaluation runs after the model has been saved.
Code Reference
| Property | Value |
|---|---|
| Repository | open-r1 |
| File (SFT) | src/open_r1/sft.py |
| Lines (SFT) | L127-163 |
| File (GRPO) | src/open_r1/grpo.py |
| Lines (GRPO) | L139-175 |
| Import | Part of training script; uses from trl import SFTTrainer or from trl import GRPOTrainer |
Signature
# Model saving sequence (from sft.py):
trainer.model.generation_config.eos_token_id = tokenizer.eos_token_id
trainer.save_model(training_args.output_dir)
kwargs = {"dataset_name": script_args.dataset_name, "tags": ["open-r1"]}
if trainer.accelerator.is_main_process:
    trainer.create_model_card(**kwargs)
    trainer.model.config.use_cache = True
    trainer.model.config.save_pretrained(training_args.output_dir)
if training_args.push_to_hub:
    trainer.push_to_hub(**kwargs)
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| trainer | GRPOTrainer (SFTTrainer in sft.py) | Yes | Trained trainer instance from the completed training loop; provides save_model, create_model_card, push_to_hub, and access to accelerator.is_main_process |
| training_args | GRPOConfig (SFTConfig in sft.py) | Yes | Training configuration; this step reads output_dir, push_to_hub, and do_eval |
| script_args | ScriptArguments | Yes | Script-level arguments; this step reads dataset_name |
| tokenizer | PreTrainedTokenizer | Yes | Tokenizer used during training; provides eos_token_id for generation config alignment |
Outputs
| Output | Description |
|---|---|
| Model files on disk | Model weights, tokenizer files, and configuration saved to training_args.output_dir |
| Model card | Auto-generated model card with dataset name and "open-r1" tag, written to the output directory |
| Updated config | Model config with use_cache = True restored, saved to the output directory |
| Hub upload | (Optional) Model, config, and model card pushed to HuggingFace Hub with tags when push_to_hub is enabled |
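The outputs listed above can be spot-checked after a run. The following is a minimal sketch, not part of open-r1: check_saved_outputs is a hypothetical helper, and it assumes the model config is written as config.json and the model card as README.md in the output directory. Whether use_cache appears in config.json at all may depend on the model's config class, so the sketch only flags an explicit false value.

```python
import json
from pathlib import Path


def check_saved_outputs(output_dir):
    """Hypothetical helper: list problems found in a saved output directory."""
    out = Path(output_dir)
    problems = []

    # Assumption: the model config is serialized as config.json.
    config_path = out / "config.json"
    if not config_path.is_file():
        problems.append("config.json is missing")
    else:
        config = json.loads(config_path.read_text(encoding="utf-8"))
        # Only an explicit False is flagged; an absent key is not treated as an error.
        if config.get("use_cache") is False:
            problems.append("use_cache is False in config.json")

    # Assumption: the model card is written as README.md.
    if not (out / "README.md").is_file():
        problems.append("README.md (model card) is missing")

    return problems
```

An empty list means none of the checked conditions failed; it does not validate the weights or tokenizer files.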
Usage Examples
Save Sequence from sft.py
# At the end of main() in sft.py, after training completes:
# 1. Align EOS token for generation
trainer.model.generation_config.eos_token_id = tokenizer.eos_token_id
# 2. Save model to disk
trainer.save_model(training_args.output_dir)
# 3. Create model card and restore KV cache (main process only)
kwargs = {"dataset_name": script_args.dataset_name, "tags": ["open-r1"]}
if trainer.accelerator.is_main_process:
    trainer.create_model_card(**kwargs)
    trainer.model.config.use_cache = True
    trainer.model.config.save_pretrained(training_args.output_dir)
# 4. Optionally push to Hub
if training_args.push_to_hub:
    trainer.push_to_hub(**kwargs)
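Dry-Run of the Save Sequence

The main-process gating in the sequence above can be exercised without a model or GPU. The sketch below is illustrative only: RecordingTrainer, save_and_push, and dry_run are hypothetical names that do not exist in open-r1 or TRL, and the trainer, arguments, and tokenizer are plain stand-in objects that merely record which methods were called. The evaluation step is left out.

```python
from types import SimpleNamespace


class RecordingTrainer:
    """Hypothetical stand-in for SFTTrainer/GRPOTrainer that records calls."""

    def __init__(self, is_main_process):
        self.calls = []
        self.accelerator = SimpleNamespace(is_main_process=is_main_process)
        config = SimpleNamespace(
            use_cache=False,  # stand-in for a config with the KV cache disabled
            save_pretrained=lambda path: self.calls.append(("config.save_pretrained", path)),
        )
        generation_config = SimpleNamespace(eos_token_id=None)
        self.model = SimpleNamespace(config=config, generation_config=generation_config)

    def save_model(self, output_dir):
        self.calls.append(("save_model", output_dir))

    def create_model_card(self, **kwargs):
        self.calls.append(("create_model_card", kwargs))

    def push_to_hub(self, **kwargs):
        self.calls.append(("push_to_hub", kwargs))


def save_and_push(trainer, training_args, script_args, tokenizer):
    """Hypothetical helper wrapping the save sequence in one function."""
    trainer.model.generation_config.eos_token_id = tokenizer.eos_token_id
    trainer.save_model(training_args.output_dir)
    kwargs = {"dataset_name": script_args.dataset_name, "tags": ["open-r1"]}
    if trainer.accelerator.is_main_process:
        trainer.create_model_card(**kwargs)
        trainer.model.config.use_cache = True
        trainer.model.config.save_pretrained(training_args.output_dir)
    if training_args.push_to_hub:
        trainer.push_to_hub(**kwargs)


def dry_run(is_main_process, push_to_hub):
    """Run the sequence against stand-in objects and return the recording trainer."""
    trainer = RecordingTrainer(is_main_process)
    training_args = SimpleNamespace(output_dir="out", push_to_hub=push_to_hub)
    script_args = SimpleNamespace(dataset_name="example/dataset")  # placeholder name
    tokenizer = SimpleNamespace(eos_token_id=2)  # placeholder token id
    save_and_push(trainer, training_args, script_args, tokenizer)
    return trainer
```

Calling dry_run(is_main_process=False, push_to_hub=False) records only save_model, which shows that the model card and config steps are skipped on non-main processes in this sketch.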