Overview
Concrete tool for generating a list of benchmark configurations at varying levels of thoroughness, provided by the HuggingFace Transformers benchmark framework.
Description
get_config_by_level is a factory function that returns a list of BenchmarkConfig objects for a given benchmark thoroughness level (0-4). Levels 0-2 return a small, cumulative set of curated, commonly used configurations for quick validation. Levels 3 and 4 return the full Cartesian product of attention implementations, compile modes, kernelization options, and batching modes for exhaustive benchmarking; level 3 restricts compile modes to None and default, while level 4 uses all five. The companion function adapt_configs takes an existing list of configurations and expands it across the supplied values of the workload and run parameters (batch size, sequence length, number of tokens to generate, warmup and measurement iteration counts, and the GPU monitoring flag) using a Cartesian product, which enables workload-shape sweeps.
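The Cartesian expansion described for adapt_configs can be pictured with a minimal sketch. This is not the framework's code: ToyConfig and expand are hypothetical stand-ins, only two workload fields are modeled, and each swept argument is assumed to already be a list.

```python
from dataclasses import dataclass, replace
from itertools import product


@dataclass(frozen=True)
class ToyConfig:
    """Hypothetical stand-in for BenchmarkConfig; not the real class."""
    attn_implementation: str
    batch_size: int = 1
    sequence_length: int = 128


def expand(configs, batch_sizes, sequence_lengths):
    """Return one copy of each base config per (batch size, sequence length) pair."""
    return [
        replace(cfg, batch_size=bs, sequence_length=sl)
        for cfg, bs, sl in product(configs, batch_sizes, sequence_lengths)
    ]


base = [ToyConfig("eager"), ToyConfig("sdpa")]
expanded = expand(base, batch_sizes=[1, 4], sequence_lengths=[128, 512, 1024])
print(len(expanded))  # 2 base configs x 2 batch sizes x 3 sequence lengths = 12
```

Because every swept dimension multiplies the output size, long value lists on several dimensions at once can grow the benchmark matrix quickly.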
Usage
Use get_config_by_level to generate a standard set of benchmark configurations for a model evaluation. Use adapt_configs to expand those configurations across multiple input dimensions for scaling analysis. Together, they produce the complete configuration matrix for a benchmark sweep.
Code Reference
Source Location
- Repository: transformers
- File: benchmark_v2/framework/benchmark_config.py (lines 200-270)
Signature
def get_config_by_level(level: int) -> list[BenchmarkConfig]:
    ...

def adapt_configs(
    configs: list[BenchmarkConfig],
    warmup_iterations: int | list[int] = 5,
    measurement_iterations: int | list[int] = 20,
    batch_size: int | list[int] = 1,
    sequence_length: int | list[int] = 128,
    num_tokens_to_generate: int | list[int] = 128,
    gpu_monitoring: bool | list[bool] = True,
) -> list[BenchmarkConfig]:
    ...
Import
from benchmark_v2.framework.benchmark_config import get_config_by_level, adapt_configs
I/O Contract
Inputs (get_config_by_level)
| Name | Type | Required | Description |
|---|---|---|---|
| level | int | Yes | Thoroughness level (0-4). Level 0 produces 1 config; Level 1 adds 4 more; Level 2 adds 4 more; Levels 3-4 produce the full Cartesian product. |
Outputs (get_config_by_level)
| Name | Type | Description |
|---|---|---|
| configs | list[BenchmarkConfig] | A list of validated BenchmarkConfig objects for the given level. |
Inputs (adapt_configs)
| Name | Type | Required | Description |
|---|---|---|---|
| configs | list[BenchmarkConfig] | Yes | Base configurations to expand. |
| warmup_iterations | int \| list[int] | No (default: 5) | Warmup iteration count(s) to sweep. |
| measurement_iterations | int \| list[int] | No (default: 20) | Measurement iteration count(s) to sweep. |
| batch_size | int \| list[int] | No (default: 1) | Batch size(s) to sweep. |
| sequence_length | int \| list[int] | No (default: 128) | Sequence length(s) to sweep. |
| num_tokens_to_generate | int \| list[int] | No (default: 128) | Token generation count(s) to sweep. |
| gpu_monitoring | bool \| list[bool] | No (default: True) | GPU monitoring flag(s) to sweep. |
Outputs (adapt_configs)
| Name | Type | Description |
|---|---|---|
| adapted_configs | list[BenchmarkConfig] | Expanded list of configurations with all parameter combinations applied. |
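Since each argument accepts either a single value or a list, the size of the expanded list can be estimated before running anything. The sketch below is an illustration under assumptions, not the library's implementation: as_list and expected_size are hypothetical helpers, and it assumes a scalar is treated as a one-element list.

```python
from math import prod


def as_list(value):
    """Wrap a scalar in a list; return lists unchanged (assumed behaviour)."""
    return value if isinstance(value, list) else [value]


def expected_size(num_base_configs, **sweeps):
    """Number of configs a full Cartesian expansion would produce."""
    return num_base_configs * prod(len(as_list(v)) for v in sweeps.values())


size = expected_size(
    5,  # e.g. the five cumulative level-1 configs
    warmup_iterations=5,
    measurement_iterations=20,
    batch_size=[1, 4, 8],
    sequence_length=[128, 512, 1024],
    num_tokens_to_generate=128,
    gpu_monitoring=True,
)
print(size)  # 5 * 1 * 1 * 3 * 3 * 1 * 1 = 45
```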
Level Details
| Level | Configurations Generated | Description |
|---|---|---|
| 0 | 1 | Flex Attention with default compilation. Smoke test. |
| 1 | 5 (cumulative) | Adds Flash Attention 2, eager+compiled, Flash Attention 2+continuous batching. |
| 2 | 9 (cumulative) | Adds SDPA+compiled, kernelized variants, SDPA+continuous batching. |
| 3 | Full product (2 compile modes) | All attention x {None, default} compile x kernelization x continuous batching. |
| 4 | Full product (5 compile modes) | All attention x all 5 compile modes x kernelization x continuous batching. |
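To make the difference between levels 3 and 4 concrete, the product can be enumerated with itertools.product. The lists below are assumptions for illustration only: the four attention names are the ones mentioned on this page (the real set may differ), the level-4 compile modes are None plus the four standard torch.compile modes, and kernelization is assumed to be available so both values are swept. full_product is a hypothetical helper, not part of the framework.

```python
from itertools import product

# Assumed values for illustration; the real lists are defined by the framework.
ATTN_IMPLEMENTATIONS = ["eager", "sdpa", "flash_attention_2", "flex_attention"]
COMPILE_MODES_LEVEL_3 = [None, "default"]
COMPILE_MODES_LEVEL_4 = [
    None,
    "default",
    "reduce-overhead",
    "max-autotune",
    "max-autotune-no-cudagraphs",
]


def full_product(level):
    """Enumerate (attention, compile mode, kernelize, continuous batching) tuples."""
    compile_modes = COMPILE_MODES_LEVEL_4 if level >= 4 else COMPILE_MODES_LEVEL_3
    return list(
        product(ATTN_IMPLEMENTATIONS, compile_modes, [False, True], [False, True])
    )


print(len(full_product(3)))  # 4 * 2 * 2 * 2 = 32 under the assumed lists
print(len(full_product(4)))  # 4 * 5 * 2 * 2 = 80 under the assumed lists
```

The actual counts depend on which attention implementations and kernels are available in the environment, so these numbers should be read as an order-of-magnitude guide only.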
Usage Examples
Basic Usage
from benchmark_v2.framework.benchmark_config import get_config_by_level, adapt_configs
# Generate a quick smoke-test configuration set
configs = get_config_by_level(level=0)
print(len(configs)) # 1
# Generate an exhaustive configuration set (compile modes: None and default)
configs = get_config_by_level(level=3)
print(len(configs)) # Full Cartesian product
Sweeping Across Input Dimensions
# Start with level-1 configs, then sweep batch sizes and sequence lengths
base_configs = get_config_by_level(level=1)
expanded_configs = adapt_configs(
    base_configs,
    warmup_iterations=5,
    measurement_iterations=20,
    batch_size=[1, 4, 8],
    sequence_length=[128, 512, 1024],
    num_tokens_to_generate=128,
)
print(len(expanded_configs)) # 5 base configs x 3 batch sizes x 3 seq lengths = 45