Implementation:Mlflow Mlflow Set Matrix
| Knowledge Sources | |
|---|---|
| Domains | CI/CD, Testing, Build Infrastructure |
| Last Updated | 2026-02-13 20:00 GMT |
Overview
Generates the GitHub Actions test matrix for cross-version compatibility tests across all MLflow ML framework integrations.
Description
set_matrix.py is a core CI infrastructure script that determines which ML framework version combinations MLflow's cross-version test suite runs. It reads the ml-package-versions.yml configuration file, fetches the available package versions from PyPI, keeps those inside each supported range while dropping versions marked unsupported, infers the appropriate Python and Java versions, and outputs a JSON matrix suitable for GitHub Actions workflows.
The script uses Pydantic models to validate the configuration structure, including PackageInfo (pip release name, dev install commands, repo URL), TestConfig (minimum/maximum versions, unsupported specifiers, requirements, run commands), and FlavorConfig (combining package info with models and autologging test configs). A custom Version class extends packaging.version.Version to handle dev versions by treating them as a very large numeric version (9999.9999.9999).
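The dev-version handling can be illustrated with a minimal, stdlib-only sketch. It does not use packaging.version.Version or the script's own Version class; `sort_key`, `DEV_VERSION`, and `DEV_SORT_KEY` are hypothetical names, and only plain dotted-integer versions are handled. The idea shown is the same: map the dev build to 9999.9999.9999 so it sorts after every real release.

```python
DEV_VERSION = "dev"  # assumed label for an unreleased development build
DEV_SORT_KEY = (9999, 9999, 9999)  # mirrors the 9999.9999.9999 placeholder


def sort_key(version: str) -> tuple[int, ...]:
    """Return a comparable key; 'dev' sorts after every numeric release."""
    if version == DEV_VERSION:
        return DEV_SORT_KEY
    # Only plain dotted-integer versions are handled in this sketch.
    return tuple(int(part) for part in version.split("."))


versions = ["1.10.0", "dev", "1.2.0", "2.0.1"]
ordered = sorted(versions, key=sort_key)
print(ordered)  # ['1.2.0', '1.10.0', '2.0.1', 'dev']
```

Because the placeholder is an ordinary numeric key, range checks such as "is this version at most the maximum?" need no special case for dev builds.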
Key capabilities include:
- Fetching released versions from the PyPI JSON API with release date tracking
- Filtering versions by min/max ranges, unsupported specifiers, and release recency
- Detecting changed flavors from Git file diffs to run only affected tests
- Generating tracing SDK test variants for autologging categories
- Splitting the matrix across multiple GitHub Actions jobs to stay within the limit of 256 jobs per matrix
- Validating test coverage to ensure all test files are executed
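The first two capabilities can be sketched as follows. This is an illustration under stated assumptions, not the script's code: `released_versions` and `filter_versions` are hypothetical helpers, the payload is made-up data shaped like a PyPI JSON API response (a `releases` mapping whose files carry an `upload_time`), unsupported versions are given as exact strings rather than specifier sets, and skipping releases that have no files is an assumption of the sketch.

```python
from datetime import datetime


def parse(version: str) -> tuple[int, ...]:
    """Turn '1.2.0' into (1, 2, 0); only dotted integers are handled here."""
    return tuple(int(part) for part in version.split("."))


def released_versions(payload: dict) -> dict[str, datetime]:
    """Map each release that has files to its earliest upload time."""
    released = {}
    for version, files in payload["releases"].items():
        if not files:  # assumption: a release without files is skipped
            continue
        released[version] = min(
            datetime.fromisoformat(f["upload_time"]) for f in files
        )
    return released


def filter_versions(released, minimum, maximum, unsupported=()):
    """Keep versions within [minimum, maximum] that are not blocked."""
    lo, hi = parse(minimum), parse(maximum)
    blocked = set(unsupported)
    kept = [v for v in released if lo <= parse(v) <= hi and v not in blocked]
    return sorted(kept, key=parse)


# Made-up payload in the shape of a PyPI JSON API response.
sample_payload = {
    "releases": {
        "0.5.0": [],
        "0.9.0": [{"upload_time": "2021-03-01T10:00:00"}],
        "1.0.0": [
            {"upload_time": "2022-01-15T08:30:00"},
            {"upload_time": "2022-01-15T08:00:00"},
        ],
        "1.1.0": [{"upload_time": "2022-06-01T09:00:00"}],
        "1.2.0": [{"upload_time": "2023-02-10T14:00:00"}],
        "2.0.0": [{"upload_time": "2024-01-05T11:00:00"}],
    }
}

released = released_versions(sample_payload)
selected = filter_versions(released, "1.0.0", "1.2.0", unsupported=["1.1.0"])
print(selected)  # ['1.0.0', '1.2.0']
```

Keeping the release date alongside each version is what makes a later recency filter possible without a second request.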
Usage
Use this script as part of CI workflows to generate the cross-version test matrix, or locally to test specific flavors or versions during development.
Code Reference
Source Location
- Repository: Mlflow_Mlflow
- File: dev/set_matrix.py
- Lines: 1-819
Signature
class Version(OriginalVersion):
    def __init__(self, version: str, release_date: datetime | None = None): ...
class PackageInfo(BaseModel):
    pip_release: str
    install_dev: str | None = None
    module_name: str | None = None
    genai: bool = False
    repo: str | None = None
class TestConfig(BaseModel):
    minimum: Version
    maximum: Version
    unsupported: list[SpecifierSet] | None = None
    requirements: dict[str, list[str]] | None = None
    python: dict[str, str] | None = None
    runs_on: dict[str, str] | None = None
    java: dict[str, str] | None = None
    run: str
    allow_unreleased_max_version: bool | None = None
    pre_test: str | None = None
    test_every_n_versions: int = 1
    test_tracing_sdk: bool = False
class FlavorConfig(BaseModel):
    package_info: PackageInfo
    models: TestConfig | None = None
    autologging: TestConfig | None = None
class MatrixItem(BaseModel):
    name: str
    flavor: str
    category: str
    job_name: str
    install: str
    run: str
    package: str
    version: Version
    python: str
    java: str
    supported: bool
    free_disk_space: bool
    runs_on: str
    pre_test: str | None = None
def generate_matrix(args) -> set[MatrixItem]: ...
def main(args): ...
Import
# The script is run directly rather than imported; with no flags it tests all items
python dev/set_matrix.py
# Test a specific flavor
python dev/set_matrix.py --flavors sklearn
# Exclude dev versions
python dev/set_matrix.py --no-dev
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| --versions-yaml | str | No | URL or local path to the config YAML (default: mlflow/ml-package-versions.yml) |
| --ref-versions-yaml | str | No | Reference config YAML for diffing to identify updates |
| --changed-files | str | No | Newline-separated list of changed files for selective testing |
| --flavors | str | No | Comma-separated flavors to test (e.g. "sklearn, xgboost") |
| --versions | str | No | Comma-separated versions to test (e.g. "1.2.3, 4.5.6") |
| --no-dev | flag | No | Exclude dev versions from the matrix |
| --only-latest | flag | No | Only test the latest version per group |
Outputs
| Name | Type | Description |
|---|---|---|
| matrix1 | JSON (GitHub Actions output) | First chunk of the test matrix with include array and job_name list |
| matrix2 | JSON (GitHub Actions output) | Second chunk of the test matrix |
| is_matrix1_empty | str | "true" if matrix1 contains no items, otherwise "false" |
| is_matrix2_empty | str | "true" if matrix2 contains no items, otherwise "false" |
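A minimal sketch of how the four outputs in the table above might be produced is shown below. The chunking strategy (two roughly equal halves), the helper names `split_matrix` and `format_outputs`, and the sample items are assumptions; the script's actual splitting logic may differ. The `name=value` line format is the one GitHub Actions reads from the file named by the GITHUB_OUTPUT environment variable.

```python
import json

MAX_JOBS_PER_MATRIX = 256  # the job limit mentioned under key capabilities


def split_matrix(items: list[dict], num_chunks: int = 2) -> list[list[dict]]:
    """Split items into num_chunks lists, padding with empty lists."""
    chunk_size = max(1, -(-len(items) // num_chunks))  # ceiling division
    chunks = [items[i : i + chunk_size] for i in range(0, len(items), chunk_size)]
    chunks += [[] for _ in range(num_chunks - len(chunks))]
    for chunk in chunks:
        if len(chunk) > MAX_JOBS_PER_MATRIX:
            raise ValueError("chunk exceeds the per-matrix job limit")
    return chunks


def format_outputs(items: list[dict]) -> list[str]:
    """Build name=value lines for matrixN and is_matrixN_empty."""
    lines = []
    for idx, chunk in enumerate(split_matrix(items), start=1):
        matrix = {"include": chunk, "job_name": [i["job_name"] for i in chunk]}
        lines.append(f"matrix{idx}={json.dumps(matrix)}")
        lines.append(f"is_matrix{idx}_empty={'true' if not chunk else 'false'}")
    return lines


# Hypothetical matrix items; real items carry many more fields.
items = [
    {"job_name": "sklearn / models / 1.0.0"},
    {"job_name": "sklearn / models / 1.2.0"},
    {"job_name": "xgboost / autologging / 2.0.0"},
]
outputs = format_outputs(items)
for line in outputs:
    print(line)
```

In a workflow step these lines would be appended to the GITHUB_OUTPUT file, and a downstream job can be skipped when its `is_matrixN_empty` value is "true".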
Usage Examples
Basic Usage
# Generate full test matrix
python dev/set_matrix.py
# Test only sklearn flavor
python dev/set_matrix.py --flavors sklearn
# Test affected flavors from changed files
python dev/set_matrix.py --changed-files "mlflow/sklearn/__init__.py"
# Compare against upstream config to test only changed versions
python dev/set_matrix.py --ref-versions-yaml \
"https://raw.githubusercontent.com/mlflow/mlflow/master/ml-package-versions.yml"
# Test specific version only
python dev/set_matrix.py --versions 1.1.1
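The --changed-files example above relies on mapping file paths to flavors. One simple way to do this is sketched here; it assumes a flavor is identified by the directory directly under mlflow/ or tests/, and that the set of known flavors is supplied by the caller. `changed_flavors` is a hypothetical helper, and the script's real detection rules may be broader.

```python
from pathlib import PurePosixPath


def changed_flavors(changed_files: str, known_flavors: set[str]) -> set[str]:
    """Return flavors whose mlflow/<flavor>/ or tests/<flavor>/ files changed."""
    flavors = set()
    for line in changed_files.splitlines():
        parts = PurePosixPath(line.strip()).parts
        # Assumption: the flavor name is the second path component.
        if len(parts) >= 2 and parts[0] in ("mlflow", "tests") and parts[1] in known_flavors:
            flavors.add(parts[1])
    return flavors


# Newline-separated input, as described for --changed-files.
changed = "mlflow/sklearn/__init__.py\ntests/xgboost/test_model.py\nREADME.md\n"
affected = changed_flavors(changed, {"sklearn", "xgboost", "pytorch"})
print(sorted(affected))  # ['sklearn', 'xgboost']
```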