Implementation:Huggingface Open r1 Trainer Save and Push

From Leeroopedia
Revision as of 15:10, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Huggingface_Open_r1_Trainer_Save_and_Push.md)


Knowledge Sources
Domains: NLP, Infrastructure
Last Updated: 2026-02-08 00:00 GMT

Overview

Wrapper around the HuggingFace Trainer's save_model and push_to_hub methods that adds Open-R1-specific model card creation and re-enables the KV cache (use_cache = True) in the saved config.

Description

This is a Wrapper Doc. The saving logic is identical in both sft.py and grpo.py:

  1. Align the model generation config EOS token with the tokenizer's EOS token to prevent unbounded generation.
  2. Call trainer.save_model(output_dir) to persist model weights and tokenizer configuration to disk.
  3. On the main process only: create a model card with dataset_name and the "open-r1" tag, restore use_cache = True for inference efficiency, and save the updated config to the output directory.
  4. Optionally push the saved model to HuggingFace Hub with the same kwargs (dataset name and tags).

The implementation also supports evaluation after saving when do_eval is True.
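
The steps above, together with the optional evaluation, can be sketched as one helper. This is a minimal sketch under stated assumptions, not the code in sft.py or grpo.py: the function name save_and_push is hypothetical, placing evaluation between the save and the Hub push is an assumption, and a MagicMock stands in for a real trainer so the call order can be inspected without training anything.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def save_and_push(trainer, training_args, script_args, tokenizer):
    """Hypothetical helper that mirrors the documented save sequence."""
    # 1. Align the generation config EOS token with the tokenizer's EOS token.
    trainer.model.generation_config.eos_token_id = tokenizer.eos_token_id

    # 2. Persist weights and tokenizer configuration.
    trainer.save_model(training_args.output_dir)

    # 3. Main process only: model card, re-enable the KV cache, save the config.
    kwargs = {"dataset_name": script_args.dataset_name, "tags": ["open-r1"]}
    if trainer.accelerator.is_main_process:
        trainer.create_model_card(**kwargs)
        trainer.model.config.use_cache = True
        trainer.model.config.save_pretrained(training_args.output_dir)

    # Optional evaluation after saving (exact placement is an assumption).
    if training_args.do_eval:
        metrics = trainer.evaluate()
        trainer.log_metrics("eval", metrics)
        trainer.save_metrics("eval", metrics)

    # 4. Optional Hub push, reusing the same kwargs as the model card.
    if training_args.push_to_hub:
        trainer.push_to_hub(**kwargs)


# Stand-in objects; every value here is illustrative only.
trainer = MagicMock()
trainer.accelerator.is_main_process = True
training_args = SimpleNamespace(output_dir="out", do_eval=True, push_to_hub=True)
script_args = SimpleNamespace(dataset_name="my-org/my-dataset")
tokenizer = SimpleNamespace(eos_token_id=2)

save_and_push(trainer, training_args, script_args, tokenizer)

# Names of the trainer methods that were called, in order.
call_order = [call[0] for call in trainer.method_calls]
```

Building kwargs once and passing it to both create_model_card and push_to_hub keeps the dataset name and the "open-r1" tag identical on disk and on the Hub.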

Usage

Invoked at the end of the main() function in both sft.py and grpo.py, once the training loop has completed. The model is saved first; when do_eval is True, evaluation runs after the save rather than before it.

Code Reference

Property Value
Repository open-r1
File (SFT) src/open_r1/sft.py
Lines (SFT) L127-163
File (GRPO) src/open_r1/grpo.py
Lines (GRPO) L139-175
Import None of its own; the sequence is part of the training script, which imports the trainer with from trl import SFTTrainer (sft.py) or from trl import GRPOTrainer (grpo.py)

Signature

# Model saving sequence (from sft.py):
trainer.model.generation_config.eos_token_id = tokenizer.eos_token_id
trainer.save_model(training_args.output_dir)

kwargs = {"dataset_name": script_args.dataset_name, "tags": ["open-r1"]}
if trainer.accelerator.is_main_process:
    trainer.create_model_card(**kwargs)
    trainer.model.config.use_cache = True
    trainer.model.config.save_pretrained(training_args.output_dir)

if training_args.push_to_hub:
    trainer.push_to_hub(**kwargs)

I/O Contract

Inputs

Parameter Type Required Description
trainer SFTTrainer or GRPOTrainer Yes Trained trainer instance from the completed training loop; provides save_model, create_model_card, push_to_hub, and access to accelerator.is_main_process
training_args SFTConfig or GRPOConfig Yes Training configuration containing:
  • output_dir -- directory path where model files are saved
  • push_to_hub -- boolean flag controlling whether to push to HuggingFace Hub
  • do_eval -- boolean flag controlling whether evaluation runs after saving
script_args ScriptArguments Yes Script-level arguments containing:
  • dataset_name -- name of the training dataset for the model card
tokenizer PreTrainedTokenizer Yes Tokenizer used during training; provides eos_token_id for generation config alignment
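
Why the EOS alignment matters can be illustrated with a toy decoding loop. This is only a sketch of the stopping rule, not the transformers generation code: the function toy_generate, the token ids, and the scripted model output are all hypothetical.

```python
def toy_generate(next_token, eos_token_id, max_new_tokens=16):
    """Collect tokens until eos_token_id is produced or the budget runs out."""
    tokens = []
    for step in range(max_new_tokens):
        token = next_token(step)
        tokens.append(token)
        if token == eos_token_id:
            break
    return tokens


TOKENIZER_EOS = 2  # the id the model learned to emit as end-of-sequence


def scripted_model(step):
    """Hypothetical model: emits the tokenizer's EOS at step 3, filler otherwise."""
    return TOKENIZER_EOS if step == 3 else 7


# Generation config still holds a stale EOS id: the stop condition never fires.
stale = toy_generate(scripted_model, eos_token_id=0)

# Generation config aligned with the tokenizer: generation stops at the EOS.
aligned = toy_generate(scripted_model, eos_token_id=TOKENIZER_EOS)
```

With the stale id the loop only ends when max_new_tokens is exhausted, which is the unbounded-generation behaviour that the alignment step is meant to avoid.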

Outputs

Output Description
Model files on disk Model weights, tokenizer files, and configuration saved to training_args.output_dir
Model card Auto-generated model card with dataset name and "open-r1" tag, written to the output directory
Updated config Model config with use_cache = True restored, saved to the output directory
Hub upload (Optional) Model, config, and model card pushed to HuggingFace Hub with tags when push_to_hub is enabled
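
A quick way to sanity-check the outputs listed above is to inspect the output directory after a run. The sketch below assumes the config is stored as config.json and the model card as README.md, which are the usual file names but are not stated on this page; the helper name check_saved_outputs is hypothetical, and a temporary directory stands in for a real training_args.output_dir.

```python
import json
import tempfile
from pathlib import Path


def check_saved_outputs(output_dir):
    """Report which expected files exist and what use_cache is set to."""
    out = Path(output_dir)
    config_path = out / "config.json"  # assumed config file name
    use_cache = None
    if config_path.exists():
        # A missing key is reported as None rather than treated as an error.
        use_cache = json.loads(config_path.read_text()).get("use_cache")
    return {
        "has_config": config_path.exists(),
        "has_model_card": (out / "README.md").exists(),  # assumed card file name
        "use_cache": use_cache,
    }


# Demonstration on a fake output directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text(json.dumps({"use_cache": True}))
    (Path(tmp) / "README.md").write_text("# model card placeholder\n")
    report = check_saved_outputs(tmp)
    empty_report = check_saved_outputs(Path(tmp) / "missing")
```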

Usage Examples

Save Sequence from sft.py

# At the end of main() in sft.py, after training completes:

# 1. Align EOS token for generation
trainer.model.generation_config.eos_token_id = tokenizer.eos_token_id

# 2. Save model to disk
trainer.save_model(training_args.output_dir)

# 3. Create model card and restore KV cache (main process only)
kwargs = {"dataset_name": script_args.dataset_name, "tags": ["open-r1"]}
if trainer.accelerator.is_main_process:
    trainer.create_model_card(**kwargs)
    trainer.model.config.use_cache = True
    trainer.model.config.save_pretrained(training_args.output_dir)

# 4. Optionally push to Hub
if training_args.push_to_hub:
    trainer.push_to_hub(**kwargs)
