Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Huggingface Transformers Benchmarks Entrypoint

From Leeroopedia
Knowledge Sources
Domains Benchmarking, Performance_Testing
Last Updated 2026-02-13 20:00 GMT

Overview

Concrete tool for discovering and executing benchmark modules with metrics recording to PostgreSQL and CSV outputs.

Description

The benchmarks_entrypoint.py module serves as the central CLI entrypoint for the v1 benchmark system. The MetricsRecorder class wraps a psycopg2 database connection and pandas DataFrames to store benchmark metadata, device measurements (CPU/GPU utilization, memory), and model measurements (load time, forward pass times, token generation times). It supports dual-mode storage (database and CSV) and produces summary CSVs with aggregated device statistics. The main block parses CLI arguments (repository, branch, commit info, CSV flags), creates a global MetricsRecorder, then scans a benches/ folder for Python files containing a run_benchmark function. Each discovered module is dynamically imported via importlib and executed.

Usage

Use this entrypoint to run all benchmark modules registered in the benches/ directory. It is the standard entry point for CI-driven benchmark execution with automatic metrics persistence.

Code Reference

Source Location

Signature

class MetricsRecorder:
    def __init__(
        self,
        connection=None,
        branch: str = "",
        commit_id: str = "",
        commit_msg: str = "",
        ci_run_url: str = "",
    ):
        """Initialize metrics recorder with database connection and run metadata."""

    def store_model_benchmark_metrics(
        self,
        model_name: str,
        metrics: Dict[str, Any],
    ) -> None:
        """Store model benchmark measurements."""

    def store_device_benchmark_metrics(
        self,
        device_metrics: Dict[str, Any],
    ) -> None:
        """Store device utilization measurements."""

Import

python benchmark/benchmarks_entrypoint.py --repo huggingface/transformers \
    --branch main --commit_id abc123 --csv

I/O Contract

Inputs

Name Type Required Description
--repo str Yes Repository name
--branch str Yes Git branch name
--commit_id str Yes Git commit SHA
--commit_msg str No Commit message
--csv flag No Enable CSV output mode

Outputs

Name Type Description
PostgreSQL records DB rows Benchmark metrics stored in database tables
*.csv CSV files Device and model metrics if --csv flag is set

Usage Examples

Running All Benchmarks

# Run benchmarks with CSV output
python benchmark/benchmarks_entrypoint.py \
    --repo huggingface/transformers \
    --branch main \
    --commit_id abc123 \
    --commit_msg "Fix attention" \
    --csv

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment