Implementation:Huggingface Diffusers Tests Fetcher

Knowledge Sources	Huggingface_Diffusers
Domains	CI, Testing, Dependency_Analysis
Last Updated	2026-02-13 21:00 GMT

Overview

Concrete tool for selectively determining which tests to run on a pull request by analyzing the import dependency graph between modified files and test files in the diffusers repository.

Description

The tests_fetcher.py utility (V2) implements a two-stage approach to selective test execution.

Stage 1 identifies modified files by computing the git diff between the PR branch and its base (or between the last two commits on main). Changes that only affect docstrings or comments are filtered out.

Stage 2 builds a reverse dependency map by analyzing Python imports across all source and test files: if module A imports module B, then changing B means tests for A should run. The script recursively follows this dependency chain to produce a minimal test list.

When too many pipelines are affected, the script falls back to testing only "core" pipelines (ControlNet, Stable Diffusion, SDXL, SVD, etc.). It also supports commit message flags: `[skip ci]` to skip, `[test all]` to run everything, and `[no filter]` to disable pipeline filtering.
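The reverse-dependency idea in Stage 2 can be illustrated with a small, self-contained sketch. It is not the code from tests_fetcher.py: the toy import graph `IMPORTS`, the file names, and the helpers `build_reverse_map` and `impacted_tests` are all hypothetical, and a real run would derive the graph by parsing import statements rather than hard-coding it.

```python
from collections import defaultdict

# Hypothetical toy import graph: each file maps to the files it imports.
IMPORTS = {
    "src/pkg/utils.py": [],
    "src/pkg/models.py": ["src/pkg/utils.py"],
    "src/pkg/pipeline.py": ["src/pkg/models.py"],
    "tests/test_models.py": ["src/pkg/models.py"],
    "tests/test_pipeline.py": ["src/pkg/pipeline.py"],
    "tests/test_utils.py": ["src/pkg/utils.py"],
}


def build_reverse_map(imports):
    """Invert the import graph: file -> files that import it directly."""
    reverse = defaultdict(set)
    for importer, imported_files in imports.items():
        for imported in imported_files:
            reverse[imported].add(importer)
    return reverse


def impacted_tests(modified, imports):
    """Follow reverse edges transitively and keep only files under tests/."""
    reverse = build_reverse_map(imports)
    seen = set(modified)
    stack = list(modified)
    while stack:
        current = stack.pop()
        for dependent in reverse.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                stack.append(dependent)
    return sorted(f for f in seen if f.startswith("tests/"))


# Changing models.py impacts its own test and the pipeline test, not test_utils.py.
print(impacted_tests(["src/pkg/models.py"], IMPORTS))
```

The traversal uses an explicit stack and a `seen` set so that circular imports cannot cause an infinite loop, which is a design choice of this sketch rather than a statement about the original script.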

Usage

Run this script in CI to generate the selective test list for a PR. On the main branch, it automatically diffs against the last commit. The output is a text file listing test paths and a JSON map categorizing tests by type. It is the core mechanism that keeps CI fast by avoiding running all 400+ test files on every PR.

Code Reference

Source Location

Repository: Huggingface_Diffusers
File: utils/tests_fetcher.py
Lines: 1-1128

Signature

@contextmanager
def checkout_commit(repo: Repo, commit_id: str):
    """Context manager that checks out a given commit and restores on exit."""
    ...

def get_diff(repo: Repo, base_commit: str, commits: list[str]) -> list[str]:
    """Get the list of modified files between commits."""
    ...

def get_module_dependencies(module_fname: str) -> list[str]:
    """Get all modules imported by a given module file."""
    ...

def create_reverse_dependency_map() -> dict[str, list[str]]:
    """Build a map from each module to all modules that depend on it."""
    ...

def infer_tests_to_run(
    output_file: str,
    diff_with_last_commit: bool = False,
    json_output_file: str | None = None,
):
    """Main function: identify modified files and compute impacted tests."""
    ...

def filter_tests(output_file: str, filters: list[str]):
    """Filter specific test categories from the output file."""
    ...

def parse_commit_message(commit_message: str) -> dict:
    """Parse commit flags: [skip ci], [test all], [no filter]."""
    ...

def print_tree_deps_of(module_fname: str):
    """Print the dependency tree for a specific module (debug tool)."""
    ...

def get_all_tests() -> list[str]:
    """Get all test files in the repository."""
    ...

def create_json_map(test_files: list[str], json_output_file: str):
    """Create a JSON mapping of test categories to test files."""
    ...

def update_test_map_with_core_pipelines(json_output_file: str):
    """Ensure core pipeline tests are always included."""
    ...
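As an illustration of the commit-flag handling listed above, the following is a minimal sketch of how bracketed flags could be extracted from a commit message. The helper name `parse_flags_sketch` and the dictionary keys `skip`, `test_all`, and `no_filter` are assumptions made for this example; the actual return structure of `parse_commit_message` may differ.

```python
import re


def parse_flags_sketch(commit_message):
    """Return which CI flags appear in a commit message (hypothetical helper)."""
    flags = {"skip": False, "test_all": False, "no_filter": False}
    if not commit_message:
        return flags
    # Collect every bracketed token, e.g. "[skip ci]" -> "skip ci".
    tokens = {
        token.strip().lower()
        for token in re.findall(r"\[([^\]]*)\]", commit_message)
    }
    flags["skip"] = "skip ci" in tokens
    flags["test_all"] = "test all" in tokens
    flags["no_filter"] = "no filter" in tokens
    return flags


print(parse_flags_sketch("Fix scheduler typo [skip ci]"))
```

Matching is case-insensitive in this sketch, so `[Test All]` and `[test all]` are treated the same way; whether the real script normalizes case is not stated in this page.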

Import

# CLI script — not imported as a module:
# python utils/tests_fetcher.py
# python utils/tests_fetcher.py --diff_with_last_commit
# python utils/tests_fetcher.py --print_dependencies_of src/diffusers/models/unets/unet_2d.py

I/O Contract

Inputs

Name	Type	Required	Description
output_file	str	No	Path to write the test list (default: `test_list.txt`)
json_output_file	str	No	Path to write the test map JSON (default: `test_map.json`)
diff_with_last_commit	bool	No	Diff against last commit instead of PR base
filter_tests	bool	No	Filter pipeline/repo_utils tests from the list
print_dependencies_of	str	No	Print dependency tree for a specific file (debug mode)
commit_message	str	No	Commit message to parse for CI flags

Outputs

Name	Type	Description
test_list.txt	File	Newline-separated list of test file paths to run
test_map.json	File	JSON dict mapping test categories to lists of test files
examples_test_list.txt	File	List of example tests to run (when `[test all]` is set)
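A downstream CI step might consume these files roughly as follows. This is a hedged sketch: `load_selected_tests` is a hypothetical helper, and the file formats are assumed from the table above (a whitespace-separated list of paths and a JSON object mapping category names to lists of paths).

```python
import json
from pathlib import Path


def load_selected_tests(list_path="test_list.txt", map_path="test_map.json"):
    """Read the flat test list and the category map, tolerating missing files."""
    list_file, map_file = Path(list_path), Path(map_path)
    # split() accepts any whitespace separator, so newlines or spaces both work.
    tests = list_file.read_text().split() if list_file.exists() else []
    test_map = json.loads(map_file.read_text()) if map_file.exists() else {}
    return tests, test_map


tests, test_map = load_selected_tests()
print(f"{len(tests)} test files selected across {len(test_map)} categories")
```

Returning empty containers when a file is absent lets a caller treat "no output" as "nothing to run", which fits the case where the fetcher decides no tests are impacted; that interpretation is an assumption of this sketch.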

Usage Examples

PR Test Selection

# Standard PR usage — detects branch and computes diff automatically
python utils/tests_fetcher.py

# Output files:
# test_list.txt — flat list of tests
# test_map.json — categorized test map

Main Branch Usage

# On main branch, diff against last commit
python utils/tests_fetcher.py --diff_with_last_commit

Debug Dependency Tree

# Show which tests depend on a specific module
python utils/tests_fetcher.py --print_dependencies_of src/diffusers/models/unets/unet_2d.py

Related Pages

Environment:Huggingface_Diffusers_PyTorch_CUDA_Runtime
