Implementation:Allenai Open instruct Push Folder To Hub
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, MLOps, Open Science |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
Concrete tool, provided by the Open Instruct library, for uploading a model checkpoint directory to the HuggingFace Hub.
Description
The push_folder_to_hub() function uploads an entire directory of model artifacts to a HuggingFace Hub repository. It handles the full upload workflow:
- Repository creation: If the target repository does not exist, it creates a new private repository.
- Branch creation: If a revision (branch) is given, it creates that branch if it does not already exist.
- Folder upload: Uploads all files in the output directory to the repository in a single commit with the message "upload checkpoint".
- Retry logic: The function is decorated with @retry_on_exception() to handle transient network failures during upload.
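The steps in the list above can be sketched as follows. This is a minimal illustration, not the library's source: the name push_folder_sketch and the explicit api parameter are assumptions made so the flow can be exercised without network access. The api object is expected to behave like huggingface_hub.HfApi, whose repo_exists, create_repo, create_branch, and upload_folder methods are the calls used here.

```python
def push_folder_sketch(api, output_dir, repo_id, revision=None, private=True):
    """Hypothetical sketch of the upload workflow.

    `api` is any object exposing the HfApi methods used below.
    Returns the URL a caller might log after the upload.
    """
    # 1. Repository creation: only when the repo is missing.
    if not api.repo_exists(repo_id):
        api.create_repo(repo_id, exist_ok=True, private=private)

    # 2. Branch creation: only when a revision was requested.
    if revision is not None:
        api.create_branch(repo_id=repo_id, branch=revision, exist_ok=True)

    # 3. Folder upload: everything in output_dir, in a single commit.
    api.upload_folder(
        repo_id=repo_id,
        revision=revision,
        folder_path=output_dir,
        commit_message="upload checkpoint",
    )

    # Assumption: fall back to "main" in the URL when no revision is given.
    return f"https://huggingface.co/{repo_id}/tree/{revision or 'main'}"
```

Against the real Hub, this sketch would be called as push_folder_sketch(HfApi(), "output/my_sft_model", "allenai/my-tulu-model", "sft-v1") after importing HfApi from huggingface_hub, with a valid access token configured.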
The function is designed to be called only on the main process (rank 0) in distributed training. Callers are expected to gate calls with if accelerator.is_main_process.
Usage
Call this function after saving a model checkpoint to upload it to the HuggingFace Hub. It is typically called at the end of training in finetune.py when push_to_hub=True and a save_to_hub or hf_repo_id is specified.
Code Reference
Source Location
- Repository: Open Instruct
- File: open_instruct/model_utils.py
- Lines: L554-573
Signature
@retry_on_exception()
def push_folder_to_hub(
    output_dir: str,
    hf_repo_id: str | None = None,
    hf_repo_revision: str | None = None,
    private: bool = True,
) -> None:
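This page does not show the body of @retry_on_exception(). As a rough illustration of how a retry decorator of this kind can work, here is a minimal sketch with exponential backoff. The name retry_sketch and the parameters max_attempts, delay, and backoff are assumptions for illustration and may not match the library's actual decorator.

```python
import functools
import time


def retry_sketch(max_attempts=4, delay=1.0, backoff=2.0):
    """Hypothetical retry decorator: re-run the function on any exception."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # Out of attempts: surface the last error to the caller.
                    if attempt == max_attempts:
                        raise
                    # Otherwise sleep, then lengthen the next wait.
                    time.sleep(wait)
                    wait *= backoff

        return wrapper

    return decorator
```

Retrying a whole-folder upload is reasonable only if re-running the call is harmless, which is why the repository and branch steps should tolerate already-existing targets.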
Import
from open_instruct.model_utils import push_folder_to_hub
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| output_dir | str | Yes | Local directory containing the model artifacts to upload (weights, tokenizer, config files). |
| hf_repo_id | str or None | Yes (practically) | The HuggingFace Hub repository ID in the format "{entity}/{model_name}" (e.g., "allenai/tulu-3-8b-sft"). |
| hf_repo_revision | str or None | No | Branch or tag name in the repository. If specified, creates the branch if needed and uploads to it. |
| private | bool | No | Whether the repository should be private. Defaults to True. |
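Because hf_repo_id defaults to None in the signature but is required in practice, a caller may want to validate it before attempting an upload. The helper below is a hypothetical sketch and not part of Open Instruct; it only checks the "{entity}/{model_name}" shape described in the table.

```python
def require_repo_id(hf_repo_id):
    """Hypothetical guard: return the repo ID or raise if it is unusable."""
    if hf_repo_id is None:
        raise ValueError("hf_repo_id must be set before pushing to the Hub")

    # Expect exactly one slash separating a non-empty entity and model name.
    entity, sep, model_name = hf_repo_id.partition("/")
    if not sep or not entity or not model_name or "/" in model_name:
        raise ValueError(
            f"expected '{{entity}}/{{model_name}}', got {hf_repo_id!r}"
        )
    return hf_repo_id
```

Failing early with a clear message avoids spending retry attempts on a call that cannot succeed.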
Outputs
| Name | Type | Description |
|---|---|---|
| (side effects) | None | Uploads all files from output_dir to the HuggingFace Hub repository. Creates the repo and branch if they do not exist. Logs the URL of the uploaded model. |
Usage Examples
Basic Usage
from open_instruct.model_utils import push_folder_to_hub
# Upload after training
push_folder_to_hub(
    output_dir="output/my_sft_model",
    hf_repo_id="allenai/my-tulu-model",
    hf_repo_revision="sft-v1",
    private=True,
)
# Logs: pushed to https://huggingface.co/allenai/my-tulu-model/tree/sft-v1
Integration in Training Loop
# At the end of training in finetune.py:
if accelerator.is_main_process:
    if args.push_to_hub:
        push_folder_to_hub(
            output_dir=args.output_dir,
            hf_repo_id=args.hf_repo_id,
            hf_repo_revision=args.hf_repo_revision,
        )
Dependencies
- huggingface_hub -- provides HfApi for repository management and file upload