Overview
A concrete utility, provided by the Hugging Face Optimum library, for validating that a file exists and for finding files that match a pattern in a local directory or a Hugging Face Hub repository.
Description
This module provides two utility functions for working with model files:
- validate_file_exists — Checks whether a specific file exists in a local directory or a Hugging Face Hub repository. For local paths, it uses `os.path.isfile`; for Hub repos, it queries the HfApi.
- find_files_matching_pattern — Scans a local directory (via `Path.glob`) or a Hub repository (via `HfApi.list_repo_files`) for files matching a regex pattern. Returns a list of `Path` objects.
Both functions transparently handle the local vs. Hub distinction based on whether the provided path is a local directory.
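The local-versus-Hub dispatch described above can be illustrated with a standard-library-only sketch. It is not Optimum's implementation: `sketch_find_files` and its `repo_files` parameter (a stand-in for the file listing a Hub query would return) are hypothetical names used only for illustration, and the repo ID is a placeholder.

```python
import re
import tempfile
from pathlib import Path
from typing import Iterable, List, Optional, Union


def sketch_find_files(
    model_name_or_path: Union[str, Path],
    pattern: str,
    glob_pattern: str = "**/*",
    repo_files: Optional[Iterable[str]] = None,
) -> List[Path]:
    """Hypothetical sketch: filter files by regex, locally or from a given listing."""
    regex = re.compile(pattern)
    model_path = Path(model_name_or_path)
    if model_path.is_dir():
        # Local case: enumerate with a glob, then keep paths the regex matches.
        candidates = model_path.glob(glob_pattern)
    else:
        # Hub case (simulated): `repo_files` stands in for a remote file listing.
        candidates = (Path(name) for name in (repo_files or []))
    return sorted(p for p in candidates if regex.search(p.as_posix()))


# Local directory: build a throwaway model folder and scan it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "onnx").mkdir()
    (root / "onnx" / "model.onnx").touch()
    (root / "config.json").touch()
    local_hits = sketch_find_files(root, r".*\.onnx$")
    local_names = [p.name for p in local_hits]

# Not a local directory: fall back to the supplied listing.
remote_hits = sketch_find_files(
    "some-org/some-model",  # placeholder repo ID, not a real repository
    r".*\.onnx$",
    repo_files=["config.json", "model.onnx", "onnx/decoder_model.onnx"],
)
```

The point of the sketch is that the caller passes the same arguments in both cases; only the source of the candidate file list changes.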
Usage
Use validate_file_exists to check for the presence of specific model files (e.g., ONNX weights, config files) before loading. Use find_files_matching_pattern to discover all files of a certain type (e.g., all `.onnx` files) in a model directory or repo.
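A discovery call can return zero, one, or several files, so callers usually need a rule for picking one before loading. The sketch below shows one possible rule. `select_single_file` and its `preferred_name` default are hypothetical and are not part of Optimum; the helper only assumes it receives a list of `Path` objects such as the one find_files_matching_pattern returns.

```python
from pathlib import Path
from typing import List


def select_single_file(files: List[Path], preferred_name: str = "model.onnx") -> Path:
    """Hypothetical helper: choose one file from a discovery result."""
    if not files:
        raise FileNotFoundError("no matching files were found")
    # Prefer an exact file-name match when several candidates exist.
    for path in files:
        if path.name == preferred_name:
            return path
    if len(files) > 1:
        raise ValueError(
            f"ambiguous: {len(files)} candidates and none named {preferred_name!r}"
        )
    return files[0]


chosen = select_single_file([Path("onnx/decoder_model.onnx"), Path("onnx/model.onnx")])
```

Failing loudly on an empty or ambiguous result keeps the error close to the discovery step instead of surfacing later during model loading.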
Code Reference
Source Location
Signature
def validate_file_exists(
    model_name_or_path: Union[str, Path],
    filename: str,
    subfolder: str = "",
    revision: Optional[str] = None,
    token: Optional[Union[bool, str]] = None,
) -> bool:
    """Check that filename exists in model_name_or_path directory or repo."""

def find_files_matching_pattern(
    model_name_or_path: Union[str, Path],
    pattern: str,
    glob_pattern: str = "**/*",
    subfolder: str = "",
    token: Optional[Union[bool, str]] = None,
    revision: Optional[str] = None,
) -> List[Path]:
    """Scan a model repo or local directory for files matching a regex pattern."""
Import
from optimum.utils.file_utils import validate_file_exists, find_files_matching_pattern
I/O Contract
Inputs (validate_file_exists)
| Name | Type | Required | Description |
|---|---|---|---|
| model_name_or_path | Union[str, Path] | Yes | Local path or Hub repo ID |
| filename | str | Yes | Name of the file to check |
| subfolder | str | No | Subfolder within the model directory (default: "") |
| revision | str | No | Specific model version (branch, tag, commit) |
| token | Union[bool, str] | No | Authentication token for Hub access |
Outputs (validate_file_exists)
| Name | Type | Description |
|---|---|---|
| exists | bool | True if the file exists, False otherwise |
Inputs (find_files_matching_pattern)
| Name | Type | Required | Description |
|---|---|---|---|
| model_name_or_path | Union[str, Path] | Yes | Local path or Hub repo ID |
| pattern | str | Yes | Regex pattern to match filenames |
| glob_pattern | str | No | Glob pattern for initial file listing (default: "**/*") |
| subfolder | str | No | Subfolder prefix for pattern matching |
| token | Union[bool, str] | No | Authentication token for Hub access |
| revision | str | No | Specific model version |
Outputs (find_files_matching_pattern)
| Name | Type | Description |
|---|---|---|
| files | List[Path] | List of Path objects for matching files |
Usage Examples
Checking for a Model File
from optimum.utils.file_utils import validate_file_exists
# Check local directory
exists = validate_file_exists("./my_model", "model.onnx")
# Check Hub repository
exists = validate_file_exists("bert-base-uncased", "config.json")
Finding ONNX Files
from optimum.utils.file_utils import find_files_matching_pattern
# Find all .onnx files in a local directory
onnx_files = find_files_matching_pattern("./my_model", r".*\.onnx$")
# Find all .onnx files in a Hub repository
onnx_files = find_files_matching_pattern(
    "optimum/distilbert-base-uncased-finetuned-sst-2-english",
    r".*\.onnx$",
    token=True,
)
for f in onnx_files:
    print(f)
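Because the `pattern` argument is a regular expression rather than a glob, it can help to check it against a few file names before running a scan. The snippet below uses only Python's `re` module; the file names are made-up examples, not the contents of any real repository.

```python
import re

# The pattern from the examples above: any path that ends in ".onnx".
onnx_regex = re.compile(r".*\.onnx$")

# Illustrative names only.
names = ["model.onnx", "onnx/decoder_model.onnx", "model.onnx_data", "config.json"]
matched = [name for name in names if onnx_regex.match(name)]

# Without the backslash, "." matches any character, so the pattern is looser.
loose_regex = re.compile(r".*.onnx$")
loose_hit = loose_regex.match("my_onnx") is not None
strict_hit = onnx_regex.match("my_onnx") is not None
```

Escaping the dot and anchoring with `$` keeps the match limited to names that actually end in the `.onnx` extension, which excludes sidecar files such as `model.onnx_data`.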
Related Pages