
Implementation:Allenai Open instruct Push Folder To Hub

From Leeroopedia


Knowledge Sources
Domains Machine Learning, MLOps, Open Science
Last Updated 2026-02-07 00:00 GMT

Overview

A concrete tool, provided by the Open Instruct library, for uploading a model checkpoint directory to the HuggingFace Hub.

Description

The push_folder_to_hub() function uploads an entire directory of model artifacts to a HuggingFace Hub repository. It handles the full upload workflow:

  1. Repository creation: If the target repository does not exist, it is created. The new repository is private by default; this is controlled by the private argument.
  2. Branch creation: If a specific revision (branch) is specified, it creates the branch if it does not exist.
  3. Folder upload: Uploads all files in the output directory to the repository in a single commit with the message "upload checkpoint".
  4. Retry logic: The function is decorated with @retry_on_exception() to handle transient network failures during upload.
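
The first three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Open Instruct source: the name push_folder_sketch, the injected api parameter, and the returned URL are hypothetical, chosen so the flow can be exercised without network access. The method names used (repo_exists, create_repo, create_branch, upload_folder) are those of huggingface_hub.HfApi; retry handling is left out here.

```python
def push_folder_sketch(api, output_dir, hf_repo_id, hf_repo_revision=None, private=True):
    """Hypothetical sketch of the upload flow.

    `api` is assumed to behave like huggingface_hub.HfApi.
    """
    # Step 1: create the repository only if it is missing.
    if not api.repo_exists(hf_repo_id):
        api.create_repo(hf_repo_id, exist_ok=True, private=private)

    # Step 2: create the branch when a revision was requested.
    if hf_repo_revision is not None:
        api.create_branch(repo_id=hf_repo_id, branch=hf_repo_revision, exist_ok=True)

    # Step 3: upload the whole folder as a single commit.
    api.upload_folder(
        repo_id=hf_repo_id,
        revision=hf_repo_revision,
        folder_path=output_dir,
        commit_message="upload checkpoint",
    )

    # Assumption: build the browsable URL so the caller can log it.
    url = f"https://huggingface.co/{hf_repo_id}"
    if hf_repo_revision is not None:
        url += f"/tree/{hf_repo_revision}"
    return url
```

With the real client, the call would look like push_folder_sketch(HfApi(), "output/my_sft_model", "allenai/my-tulu-model", "sft-v1"), assuming huggingface_hub is installed and a valid token is configured.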

The function is designed to be called only on the main process (rank 0) in distributed training. Callers are expected to gate calls with if accelerator.is_main_process.

Usage

Call this function after saving a model checkpoint to upload it to the HuggingFace Hub. It is typically called at the end of training in finetune.py, when push_to_hub=True and a target repository is given via save_to_hub or hf_repo_id.

Code Reference

Source Location

  • Repository: Open Instruct
  • File: open_instruct/model_utils.py
  • Lines: L554-573

Signature

@retry_on_exception()
def push_folder_to_hub(
    output_dir: str,
    hf_repo_id: str | None = None,
    hf_repo_revision: str | None = None,
    private: bool = True,
) -> None:
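
The @retry_on_exception() decorator in the signature above is what handles transient failures. Its actual implementation and parameters are not shown on this page, so the following is only a sketch of how such a decorator could work. The name retry_on_exception_sketch and all of its parameters (max_attempts, delay, backoff, exceptions) are assumptions for illustration.

```python
import functools
import time


def retry_on_exception_sketch(max_attempts=3, delay=1.0, backoff=2.0, exceptions=(Exception,)):
    """Hypothetical retry decorator; names and defaults are assumptions."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    time.sleep(wait)  # pause before the next attempt
                    wait *= backoff  # grow the pause exponentially

        return wrapper

    return decorator
```

A decorator of this shape re-runs the whole function on failure, which suits an upload that is safe to repeat, since repository and branch creation are skipped when they already exist.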

Import

from open_instruct.model_utils import push_folder_to_hub

I/O Contract

Inputs

Name | Type | Required | Description
output_dir | str | Yes | Local directory containing the model artifacts to upload (weights, tokenizer, config files).
hf_repo_id | str or None | Yes, in practice | The HuggingFace Hub repository ID in the format "{entity}/{model_name}" (e.g., "allenai/tulu-3-8b-sft"). The signature defaults to None, but an upload needs a target repository.
hf_repo_revision | str or None | No | Branch name (revision) in the repository. If specified, the branch is created if needed and the upload targets it.
private | bool | No | Whether the repository should be private. Defaults to True.
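
Because hf_repo_id defaults to None in the signature but is needed in practice, a caller may want to validate it before starting an upload. The helper below is a hypothetical guard, not part of Open Instruct. It only checks the "{entity}/{model_name}" shape described in the table.

```python
def check_repo_id(hf_repo_id):
    """Hypothetical guard: return the ID if it looks like '{entity}/{model_name}'."""
    if hf_repo_id is None:
        raise ValueError("hf_repo_id is required for an upload")
    # Split on the first slash; both halves must be non-empty.
    entity, sep, name = hf_repo_id.partition("/")
    if not sep or not entity or not name or "/" in name:
        raise ValueError(f"expected '{{entity}}/{{model_name}}', got {hf_repo_id!r}")
    return hf_repo_id
```

Calling check_repo_id(args.hf_repo_id) just before push_folder_to_hub() would fail fast on a missing or malformed ID instead of after a network round trip.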

Outputs

Name | Type | Description
(side effects) | None | Uploads all files from output_dir to the HuggingFace Hub repository. Creates the repo and branch if they do not exist. Logs the URL of the uploaded model.

Usage Examples

Basic Usage

from open_instruct.model_utils import push_folder_to_hub

# Upload after training
push_folder_to_hub(
    output_dir="output/my_sft_model",
    hf_repo_id="allenai/my-tulu-model",
    hf_repo_revision="sft-v1",
    private=True,
)
# Logs: pushed to https://huggingface.co/allenai/my-tulu-model/tree/sft-v1

Integration in Training Loop

# At the end of training in finetune.py:
if accelerator.is_main_process:
    if args.push_to_hub:
        push_folder_to_hub(
            output_dir=args.output_dir,
            hf_repo_id=args.hf_repo_id,
            hf_repo_revision=args.hf_repo_revision,
        )

Dependencies

  • huggingface_hub -- provides HfApi for repository management and file upload

Related Pages

Implements Principle
