Implementation:Triton inference server Server Git LFS Clone

Metadata

Field	Value
Type	Implementation
Workflow	LLM_Deployment_With_TRT_LLM
Repo	Triton_inference_server_Server
Source	docs/getting_started/llm.md:L81-86
Domains	NLP, LLM_Deployment
Knowledge_Sources	TRT-LLM Docs (https://nvidia.github.io/TensorRT-LLM/), Triton Server repo (https://github.com/triton-inference-server/server)
implements	Principle:Triton_inference_server_Server_Model_Weight_Download
2026-02-13 17:00 GMT

Overview

A concrete Git LFS procedure for downloading HuggingFace model weights. It covers the exact commands to initialize Git LFS and clone a model repository together with its large weight files.

Description

This implementation provides the step-by-step commands for downloading pre-trained model weights from HuggingFace Hub using Git LFS. The process ensures that all large binary files (safetensors, bin files) are fully downloaded rather than left as LFS pointer stubs.

The procedure consists of two commands:

git lfs install — Registers the Git LFS filter and hooks in the local Git configuration
git clone — Clones the HuggingFace model repository, automatically fetching LFS-tracked files

Usage

Run this step after the TRT-LLM environment setup. Make sure enough free disk space is available for the target model before cloning. The resulting model directory is the input to the checkpoint conversion step.
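The disk-space precaution can be sketched as a small preflight check. This is only an illustration: `has_free_space_gb` is a hypothetical helper name, it takes a whole number of GiB, and the threshold is the caller's choice.

```shell
# Hypothetical helper (not part of the Triton docs): succeed if the filesystem
# holding DIR has at least REQUIRED_GB GiB available.
has_free_space_gb() {
  local dir="$1" required_gb="$2" avail_kb
  # df -P gives one line per filesystem; -k reports 1024-byte blocks.
  # Column 4 of the second line is the available space.
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $(( required_gb * 1024 * 1024 )) ]
}
```

For example, `has_free_space_gb . 8 || echo "not enough free space"` could be run in the target directory before cloning.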

Code Reference

Source Location

Item	Value
File	docs/getting_started/llm.md
Lines	L81-86
Repo	https://github.com/triton-inference-server/server

Signature

git lfs install
git clone https://huggingface.co/microsoft/Phi-3-mini-4k-instruct

Import / Verification

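# Verify all LFS files are downloaded (no pointer stubs)
cd Phi-3-mini-4k-instruct
# In the listing, "*" after the object ID marks a fully downloaded file; "-" marks a pointer stub
git lfs ls-files
# Weight files should be gigabytes in size, not a few hundred bytes
ls -lh *.safetensors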
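The pointer-stub check can also be scripted without Git LFS itself, relying on the fact that an LFS pointer is a small text file whose first line starts with `version https://git-lfs.github.com/spec/v1`. The sketch below is an assumption-laden illustration: the function names are hypothetical, and only `*.safetensors` and `*.bin` files at the top level of the directory are inspected.

```shell
# Hypothetical helpers (names are illustrative, not from the Triton docs).

# Succeed if FILE looks like a Git LFS pointer rather than real content.
is_lfs_pointer() {
  head -c 64 "$1" | LC_ALL=C grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Print every weight file in DIR that is still a pointer stub.
# Returns 0 if at least one stub was found, 1 if none were found.
find_lfs_pointers() {
  local dir="$1" f rc=1
  for f in "$dir"/*.safetensors "$dir"/*.bin; do
    [ -f "$f" ] || continue          # skip unmatched glob patterns
    if is_lfs_pointer "$f"; then
      echo "pointer stub: $f"
      rc=0
    fi
  done
  return "$rc"
}
```

If `find_lfs_pointers Phi-3-mini-4k-instruct` reports any stubs, running `git lfs pull` inside the repository should fetch the missing content.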

I/O Contract

Inputs

Name	Type	Description
Network access	System	Internet connectivity to reach huggingface.co
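Disk space	System	Sufficient free disk space (~7.6 GB of weights for Phi-3-mini-4k-instruct; a Git LFS clone typically also keeps a copy of each object under `.git/lfs`, so allow roughly double)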
git-lfs	System package	Git LFS extension installed (from environment setup step)
git	System package	Git version control installed
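The two system-package inputs can be checked up front. The following is a minimal sketch, assuming the Git LFS extension is installed as a `git-lfs` executable on `PATH` (the usual layout); `require_cmd` is a hypothetical helper name.

```shell
# Hypothetical helper: fail with a message if a required command is not on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || {
    echo "missing prerequisite: $1" >&2
    return 1
  }
}
```

A typical call before cloning would be `require_cmd git && require_cmd git-lfs`.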

Outputs

Name	Type	Description
Model directory	Directory	`./Phi-3-mini-4k-instruct/` containing all model files
safetensors files	Binary files	Model weight files in safetensors format
tokenizer files	JSON files	`tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`
config.json	JSON file	Model architecture configuration (hidden_size, num_layers, etc.)
generation_config.json	JSON file	Default generation parameters (max_length, temperature, etc.)

Usage Examples

Download Phi-3-mini-4k-instruct

# Initialize Git LFS
git lfs install

# Clone the model repository (downloads all weight files)
git clone https://huggingface.co/microsoft/Phi-3-mini-4k-instruct

# Verify the download
ls -lh Phi-3-mini-4k-instruct/

Expected directory structure

Phi-3-mini-4k-instruct/
  config.json
  generation_config.json
  model-00001-of-00002.safetensors
  model-00002-of-00002.safetensors
  model.safetensors.index.json
  special_tokens_map.json
  tokenizer.json
  tokenizer_config.json
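As a rough sanity check, the listing above can be turned into a script that reports missing files. This sketch assumes exactly the file names shown in the listing and accepts any number of `*.safetensors` shards; `missing_model_files` is a hypothetical helper name.

```shell
# Hypothetical helper: print each expected file that is absent from DIR.
# Returns 0 if everything is present, 1 otherwise.
missing_model_files() {
  local dir="$1" name rc=0
  for name in config.json generation_config.json \
              model.safetensors.index.json special_tokens_map.json \
              tokenizer.json tokenizer_config.json; do
    if [ ! -f "$dir/$name" ]; then
      echo "missing: $name"
      rc=1
    fi
  done
  # At least one weight shard should be present as well.
  ls "$dir"/*.safetensors >/dev/null 2>&1 || {
    echo "missing: *.safetensors"
    rc=1
  }
  return "$rc"
}
```

For example, `missing_model_files Phi-3-mini-4k-instruct && echo "layout looks complete"`. The check only confirms that the files exist, not that the weights were fully downloaded.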

Download alternative models

# For LLaMA-2-7B (gated model: requires accepted license terms and a HuggingFace access token)
git clone https://huggingface.co/meta-llama/Llama-2-7b-hf

# For GPT-2
git clone https://huggingface.co/gpt2
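Since the clone commands differ only in the repository ID, the pattern can be parameterized. The helpers below are a hypothetical sketch: they assume the `https://huggingface.co/<repo-id>` URL form used in the examples above, and that `git clone` names the target directory after the last path component of the URL.

```shell
# Hypothetical helpers (names are illustrative).

# Build the clone URL for a HuggingFace repository ID such as "gpt2"
# or "microsoft/Phi-3-mini-4k-instruct".
hf_clone_url() {
  printf 'https://huggingface.co/%s\n' "$1"
}

# Directory name that a default `git clone` of that repository creates.
clone_dir_for() {
  printf '%s\n' "${1##*/}"
}
```

With these, `git clone "$(hf_clone_url microsoft/Phi-3-mini-4k-instruct)"` is equivalent to the command shown earlier, and `clone_dir_for` gives the directory to pass to later steps.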

Related Pages

Principle:Triton_inference_server_Server_Model_Weight_Download
Implementation:Triton_inference_server_Server_Pip_Install_Tensorrt_LLM — Prerequisite environment setup
Implementation:Triton_inference_server_Server_Convert_Checkpoint — Next step: convert weights to TRT-LLM format
Environment:Triton_inference_server_Server_TRT_LLM_Deployment
