Implementation:Triton inference server Server Git LFS Clone
Metadata
| Field | Value |
|---|---|
| Type | Implementation |
| Workflow | LLM_Deployment_With_TRT_LLM |
| Repo | Triton_inference_server_Server |
| Source | docs/getting_started/llm.md:L81-86 |
| Domains | NLP, LLM_Deployment |
| Knowledge_Sources | TRT-LLM Docs (https://nvidia.github.io/TensorRT-LLM/), Triton Server repo (https://github.com/triton-inference-server/server) |
| implements | Principle:Triton_inference_server_Server_Model_Weight_Download |
| Last_Updated | 2026-02-13 17:00 GMT |
Overview
Concrete Git LFS procedure for downloading HuggingFace model weights. This implementation covers the exact commands to initialize Git LFS and clone a model repository with all large weight files.
Description
This implementation provides the step-by-step commands for downloading pre-trained model weights from HuggingFace Hub using Git LFS. The process ensures that all large binary files (safetensors, bin files) are fully downloaded rather than left as LFS pointer stubs.
The procedure consists of two commands:
- git lfs install: Registers the Git LFS filters in the user's Git configuration (and installs the LFS hooks when run inside a repository)
- git clone: Clones the HuggingFace model repository; with Git LFS installed, LFS-tracked files are fetched automatically during checkout
Usage
Run after TRT-LLM environment setup. Ensure sufficient disk space is available for the target model. The resulting directory is used as input for the checkpoint conversion step.
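The disk-space precaution can be scripted before cloning. The following is a minimal sketch, assuming a POSIX shell with `df -Pk` and `awk` available; the helper name `has_free_space` is hypothetical and not part of the Triton guide.

```shell
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# REQUIRED_GB gigabytes free. Relies on POSIX `df -Pk` (1024-byte blocks).
has_free_space() {
    dir="$1"
    required_gb="$2"
    # Line 2 of `df -Pk` output, field 4, is the available space in KiB.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    required_kb=$((required_gb * 1024 * 1024))
    [ "$avail_kb" -ge "$required_kb" ]
}
```

A possible use is `has_free_space . 16 && git clone https://huggingface.co/microsoft/Phi-3-mini-4k-instruct`. The 16 GB figure is an illustrative margin (about twice the ~7.6 GB weight size, since Git LFS also stores each downloaded object under .git/lfs), not a documented requirement.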
Code Reference
Source Location
| Item | Value |
|---|---|
| File | docs/getting_started/llm.md |
| Lines | L81-86 |
| Repo | https://github.com/triton-inference-server/server |
Signature
git lfs install
git clone https://huggingface.co/microsoft/Phi-3-mini-4k-instruct
Import / Verification
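# Verify all LFS files are downloaded (no pointer stubs)
cd Phi-3-mini-4k-instruct
# In the listing, "*" after the OID marks a fully downloaded file; "-" marks a pointer stub
git lfs ls-files --all
# Weight files should be gigabytes in size, not a few hundred bytes
ls -lh *.safetensors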
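The pointer-stub check can also be done without Git, by inspecting file contents. A Git LFS pointer is a small text file whose first line is `version https://git-lfs.github.com/spec/v1`. The sketch below relies on that; the function names `is_lfs_pointer` and `find_pointer_stubs` are hypothetical, and `head -c` is assumed to be available (it is common but not strictly POSIX).

```shell
# Succeed if FILE is a Git LFS pointer stub rather than real content.
# The spec version line is exactly 42 bytes long, so only that prefix is read.
is_lfs_pointer() {
    head -c 42 "$1" 2>/dev/null |
        grep -qxF 'version https://git-lfs.github.com/spec/v1'
}

# Print every *.safetensors file in DIR that is still a pointer stub.
find_pointer_stubs() {
    for f in "$1"/*.safetensors; do
        [ -f "$f" ] || continue
        if is_lfs_pointer "$f"; then
            echo "$f"
        fi
    done
}
```

Running `find_pointer_stubs Phi-3-mini-4k-instruct` should print nothing after a complete download. If it lists files, `git lfs pull` inside the repository fetches the missing objects.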
I/O Contract
Inputs
| Name | Type | Description |
|---|---|---|
| Network access | System | Internet connectivity to reach huggingface.co |
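| Disk space | System | Sufficient free disk space (~7.6 GB of weights for Phi-3-mini-4k-instruct; a Git LFS clone also keeps a copy of each object under .git/lfs, so allow roughly twice that) |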
| git-lfs | System package | Git LFS extension installed (from environment setup step) |
| git | System package | Git version control installed |
Outputs
| Name | Type | Description |
|---|---|---|
| Model directory | Directory | ./Phi-3-mini-4k-instruct/ containing all model files |
| safetensors files | Binary files | Model weight files in safetensors format |
| tokenizer files | JSON files | tokenizer.json, tokenizer_config.json, special_tokens_map.json |
| config.json | JSON file | Model architecture configuration (hidden_size, num_layers, etc.) |
| generation_config.json | JSON file | Default generation parameters (max_length, temperature, etc.) |
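The outputs listed above can be checked mechanically before moving on to checkpoint conversion. This is a minimal sketch, assuming a POSIX shell; the helper name `check_model_dir` is hypothetical, and the file list simply mirrors the Outputs table, so it may need adjusting for other models.

```shell
# Hypothetical helper: report any expected file missing from MODEL_DIR.
# Returns 0 when everything is present, 1 otherwise.
check_model_dir() {
    model_dir="$1"
    missing=0
    for f in config.json generation_config.json tokenizer.json \
             tokenizer_config.json special_tokens_map.json; do
        if [ ! -f "$model_dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # At least one safetensors weight file must be present.
    if ! ls "$model_dir"/*.safetensors >/dev/null 2>&1; then
        echo "missing: *.safetensors"
        missing=1
    fi
    return "$missing"
}
```

For example, `check_model_dir ./Phi-3-mini-4k-instruct` prints one line per missing file and exits non-zero if anything is absent. It checks only that files exist, not that the weight files are complete.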
Usage Examples
Download Phi-3-mini-4k-instruct
# Initialize Git LFS
git lfs install
# Clone the model repository (downloads all weight files)
git clone https://huggingface.co/microsoft/Phi-3-mini-4k-instruct
# Verify the download
ls -lh Phi-3-mini-4k-instruct/
Expected directory structure (key files; the repository may contain additional files such as README.md)
Phi-3-mini-4k-instruct/
config.json
generation_config.json
model-00001-of-00002.safetensors
model-00002-of-00002.safetensors
model.safetensors.index.json
special_tokens_map.json
tokenizer.json
tokenizer_config.json
Download alternative models
# For Llama-2-7B (gated model: requires accepting the license on HuggingFace and authenticating with an access token)
git clone https://huggingface.co/meta-llama/Llama-2-7b-hf
# For GPT-2
git clone https://huggingface.co/gpt2
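Since the commands differ only in the model ID, they can be wrapped in a small function. The sketch below is illustrative: `hf_clone_url` and `clone_hf_model` are hypothetical names, and the wrapper does not handle authentication for gated models.

```shell
# Hypothetical helper: map a HuggingFace model ID to its clone URL.
hf_clone_url() {
    echo "https://huggingface.co/$1"
}

# Clone MODEL_ID into the current directory (requires network access).
# Runs `git lfs install` first so LFS-tracked weights are fetched.
clone_hf_model() {
    git lfs install && git clone "$(hf_clone_url "$1")"
}
```

With this in place, `clone_hf_model microsoft/Phi-3-mini-4k-instruct` is equivalent to the two-command procedure shown earlier.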
Related Pages
- Principle:Triton_inference_server_Server_Model_Weight_Download
- Implementation:Triton_inference_server_Server_Pip_Install_Tensorrt_LLM — Prerequisite environment setup
- Implementation:Triton_inference_server_Server_Convert_Checkpoint — Next step: convert weights to TRT-LLM format
- Environment:Triton_inference_server_Server_TRT_LLM_Deployment