Implementation:Huggingface Diffusers Tests Fetcher

From Leeroopedia
Knowledge Sources
Domains: CI, Testing, Dependency_Analysis
Last Updated: 2026-02-13 21:00 GMT

Overview

Concrete tool for selectively determining which tests to run on a pull request by analyzing the import dependency graph between modified files and test files in the diffusers repository.

Description

The tests_fetcher.py utility (V2) selects tests in two stages.

Stage 1 identifies modified files by computing the git diff between the PR branch and its base (or between the last two commits on main). Files whose changes affect only docstrings or comments are filtered out.

Stage 2 builds a reverse dependency map by analyzing Python imports across all source and test files: if module A imports module B, then changing B means the tests for A should run. The script follows this dependency chain recursively to produce a minimal test list.

When too many pipelines are affected, it falls back to testing only "core" pipelines (ControlNet, Stable Diffusion, SDXL, SVD, etc.). It also supports commit message flags: `[skip ci]` to skip CI, `[test all]` to run everything, and `[no filter]` to disable pipeline filtering.
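The reverse dependency idea from Stage 2 can be illustrated on a toy graph. This is a minimal sketch, not the script's actual code: the helper names `build_reverse_map` and `impacted_by`, the file paths, and the `tests/` prefix check are all hypothetical, and the import graph is hard-coded here, whereas the real script derives it by parsing imports.

```python
from collections import defaultdict


def build_reverse_map(direct_deps: dict[str, list[str]]) -> dict[str, set[str]]:
    """Invert 'A imports B' edges into 'B is imported by A' edges."""
    reverse: dict[str, set[str]] = defaultdict(set)
    for module, imported in direct_deps.items():
        for dep in imported:
            reverse[dep].add(module)
    return dict(reverse)


def impacted_by(modified: list[str], reverse: dict[str, set[str]]) -> set[str]:
    """Follow reverse edges transitively from each modified file."""
    seen: set[str] = set()
    stack = list(modified)
    while stack:
        current = stack.pop()
        for dependent in reverse.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                stack.append(dependent)
    return seen


# Hypothetical toy graph: each file maps to the files it imports.
direct_deps = {
    "src/pkg/models/unet.py": ["src/pkg/utils.py"],
    "src/pkg/pipelines/pipe.py": ["src/pkg/models/unet.py"],
    "tests/models/test_unet.py": ["src/pkg/models/unet.py"],
    "tests/pipelines/test_pipe.py": ["src/pkg/pipelines/pipe.py"],
}
reverse = build_reverse_map(direct_deps)
impacted = impacted_by(["src/pkg/utils.py"], reverse)
tests_to_run = sorted(f for f in impacted if f.startswith("tests/"))
print(tests_to_run)
```

Changing the low-level `utils.py` in this toy graph reaches both test files, because each depends on it through one or two intermediate imports. The `seen` set also keeps the traversal from looping on circular imports.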

Usage

Run this script in CI to generate the selective test list for a PR. On the main branch, it automatically diffs against the previous commit. It writes two outputs: a text file listing the test paths to run and a JSON map grouping those tests by category. This is the core mechanism that keeps CI fast, because it avoids running all 400+ test files on every PR.

Code Reference

Source Location

Signature

from contextlib import contextmanager

from git import Repo  # GitPython


@contextmanager
def checkout_commit(repo: Repo, commit_id: str):
    """Context manager that checks out a given commit and restores on exit."""
    ...

def get_diff(repo: Repo, base_commit: str, commits: list[str]) -> list[str]:
    """Get the list of modified files between commits."""
    ...

def get_module_dependencies(module_fname: str) -> list[str]:
    """Get all modules imported by a given module file."""
    ...

def create_reverse_dependency_tree() -> dict[str, list[str]]:
    """Build a map from each module to all modules that depend on it."""
    ...

def infer_tests_to_run(
    output_file: str,
    diff_with_last_commit: bool = False,
    json_output_file: str | None = None,
):
    """Main function: identify modified files and compute impacted tests."""
    ...

def filter_tests(output_file: str, filters: list[str]):
    """Filter specific test categories from the output file."""
    ...

def parse_commit_message(commit_message: str) -> dict:
    """Parse commit flags: [skip ci], [test all], [no filter]."""
    ...

def print_tree_deps_of(module_fname: str):
    """Print the dependency tree for a specific module (debug tool)."""
    ...

def get_all_tests() -> list[str]:
    """Get all test files in the repository."""
    ...

def create_json_map(test_files: list[str], json_output_file: str):
    """Create a JSON mapping of test categories to test files."""
    ...

def update_test_map_with_core_pipelines(json_output_file: str):
    """Ensure core pipeline tests are always included."""
    ...

Import

# CLI script — not imported as a module:
# python utils/tests_fetcher.py
# python utils/tests_fetcher.py --diff_with_last_commit
# python utils/tests_fetcher.py --print_dependencies_of src/diffusers/models/unets/unet_2d.py

I/O Contract

Inputs

Name | Type | Required | Description
output_file | str | No | Path to write the test list (default: `test_list.txt`)
json_output_file | str | No | Path to write the test map JSON (default: `test_map.json`)
diff_with_last_commit | bool | No | Diff against last commit instead of PR base
filter_tests | bool | No | Filter pipeline/repo_utils tests from the list
print_dependencies_of | str | No | Print dependency tree for a specific file (debug mode)
commit_message | str | No | Commit message to parse for CI flags
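The inputs above map naturally onto command-line options. The following is a sketch of how such a parser might look, assuming each table row is exposed as a `--name` option and that the boolean inputs are `store_true` switches; only the names and defaults are taken from the table, and `build_parser` is a hypothetical helper.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the input table (names and defaults from the table)."""
    parser = argparse.ArgumentParser(description="Sketch of the inputs listed above.")
    parser.add_argument("--output_file", type=str, default="test_list.txt")
    parser.add_argument("--json_output_file", type=str, default="test_map.json")
    # Assumption: boolean inputs are switches that default to False.
    parser.add_argument("--diff_with_last_commit", action="store_true")
    parser.add_argument("--filter_tests", action="store_true")
    parser.add_argument("--print_dependencies_of", type=str, default=None)
    parser.add_argument("--commit_message", type=str, default=None)
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--diff_with_last_commit"])
print(args.output_file, args.diff_with_last_commit)
```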

Outputs

Name | Type | Description
test_list.txt | File | Newline-separated list of test file paths to run
test_map.json | File | JSON dict mapping test categories to lists of test files
examples_test_list.txt | File | List of example tests to run (when `[test all]` is set)
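A downstream CI step might consume these files roughly as follows. This is a sketch under stated assumptions: the file layouts follow the table above, the sample contents and category names are invented so the example runs without the real script, and `load_test_list` and `load_test_map` are hypothetical helpers.

```python
import json
import tempfile
from pathlib import Path


def load_test_list(path: Path) -> list[str]:
    """Read test paths; split() tolerates newline- or space-separated files."""
    return path.read_text().split()


def load_test_map(path: Path) -> dict[str, list[str]]:
    """Read the category -> test files mapping."""
    return json.loads(path.read_text())


# Write hypothetical sample outputs into a temporary directory.
workdir = Path(tempfile.mkdtemp())
(workdir / "test_list.txt").write_text(
    "tests/models/test_a.py\ntests/pipelines/test_b.py\n"
)
(workdir / "test_map.json").write_text(
    json.dumps(
        {
            "models": ["tests/models/test_a.py"],
            "pipelines": ["tests/pipelines/test_b.py"],
        }
    )
)

tests = load_test_list(workdir / "test_list.txt")
test_map = load_test_map(workdir / "test_map.json")

# Build (but do not run) a pytest invocation for the selected tests.
pytest_cmd = ["python", "-m", "pytest", *tests]
print(pytest_cmd)
```

The flat list suits a single job that runs everything selected, while the JSON map lets CI fan out one job per category.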

Usage Examples

PR Test Selection

# Standard PR usage — detects branch and computes diff automatically
python utils/tests_fetcher.py

# Output files:
# test_list.txt — flat list of tests
# test_map.json — categorized test map

Main Branch Usage

# On main branch, diff against last commit
python utils/tests_fetcher.py --diff_with_last_commit

Debug Dependency Tree

# Show which tests depend on a specific module
python utils/tests_fetcher.py --print_dependencies_of src/diffusers/models/unets/unet_2d.py
