Implementation:Mlflow Mlflow Generate Protobuf Code
| Knowledge Sources | |
|---|---|
| Domains | CodeGeneration, Build |
| Last Updated | 2026-02-13 20:00 GMT |
Overview
Compiles MLflow protobuf definitions into Python code, Java code, and .pyi type stubs. The generated Python code supports two protoc versions (3.19.4 and 26.0).
Description
This script is the core code generation pipeline that produces all protobuf-derived source files used throughout the MLflow codebase. It performs the following steps:
- Downloads two protoc compiler versions (3.19.4 and 26.0) from GitHub releases, caching them in .cache/protobuf_cache/.
- Generates Python code from both protoc versions for all proto files (basic, Unity Catalog, tracing, facet) plus test protos.
- Merges the two generated outputs into version-branched Python files that select the correct code at runtime based on the installed google.protobuf version: a major version >= 5 selects the protoc 26.0 output (protoc 26.0 pairs with Python protobuf 5.26.0); any lower version selects the 3.19.4 output.
- Applies import path replacements to convert absolute imports to relative imports for proper package-internal use.
- Generates Java code with the pinned protoc 3.19.4, writing it to mlflow/java/client/src/main/java.
- Generates .pyi type stub files using protoc 26.0 for IDE support.
- Generates JSON documentation (mlflow/protos/protos.json) using the custom proto plugin.
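The merge step in the list above can be pictured with a minimal sketch. It assumes the merged file is a plain if/else on the installed google.protobuf major version that wraps the two generated sources. The names `select_gencode` and `build_branched_module` are hypothetical, and the real `generate_final_python_gencode` may lay out the file differently.

```python
import textwrap


def select_gencode(protobuf_version: str) -> str:
    """Return which protoc output a given google.protobuf version would use."""
    major = int(protobuf_version.split(".")[0])
    return "26.0" if major >= 5 else "3.19.4"


def _indent_body(source: str) -> str:
    # An empty branch would be a syntax error, so fall back to `pass`.
    body = source.strip("\n") or "pass"
    return textwrap.indent(body + "\n", "    ")


def build_branched_module(gencode_3194: str, gencode_26: str) -> str:
    """Wrap two generated sources in a runtime version check (assumed layout)."""
    return (
        "import google.protobuf\n"
        'if int(google.protobuf.__version__.split(".")[0]) >= 5:\n'
        + _indent_body(gencode_26)
        + "else:\n"
        + _indent_body(gencode_3194)
    )
```

Keeping both outputs in one module means callers import a single `*_pb2` file regardless of which protobuf runtime is installed.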
The script manages several proto file sets: basic_proto_files (service definitions), uc_proto_files (Unity Catalog), tracing_proto_files, facet_proto_files, and test_proto_files. It includes OpenTelemetry proto definitions from mlflow/protos/opentelemetry/ as an include path.
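The import path replacement step can be illustrated with a small sketch. It assumes the simplest case, where protoc emits a flat `import <name>_pb2 as <alias>` line for a sibling proto file. The regex and the name `relativize_imports` are hypothetical; the actual rules in `apply_python_gencode_replacement` are not shown here and likely cover more patterns (for example, nested OpenTelemetry packages).

```python
import re

# Assumed pattern: a top-level absolute import of a generated sibling module.
_ABS_PB2_IMPORT = re.compile(r"^import (\w+_pb2) as (\w+)$", re.MULTILINE)


def relativize_imports(source: str) -> str:
    """Rewrite `import x_pb2 as y` to `from . import x_pb2 as y`."""
    return _ABS_PB2_IMPORT.sub(r"from . import \1 as \2", source)
```

Imports of third-party packages such as `google.protobuf` do not match the pattern and are left untouched.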
Usage
Run this script whenever .proto files are modified to regenerate all derived source files. It requires a Linux environment (x86_64 or aarch64).
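The platform requirement follows from the protoc binaries the script downloads. The sketch below shows one way a release URL could be derived for a supported platform. The URL template and the `aarch_64` asset suffix are assumptions based on the naming of protobuf GitHub release assets, and `protoc_release_url` is a hypothetical helper, not a function from the script.

```python
# Assumed mapping from platform.machine() values to release asset suffixes.
_ARCH_SUFFIX = {"x86_64": "x86_64", "aarch64": "aarch_64"}


def protoc_release_url(version: str, system: str, machine: str) -> str:
    """Build the download URL for a protoc release zip (assumed naming scheme)."""
    if system != "Linux":
        raise RuntimeError(f"Unsupported OS: {system} (Linux is required)")
    if machine not in _ARCH_SUFFIX:
        raise RuntimeError(f"Unsupported architecture: {machine}")
    return (
        "https://github.com/protocolbuffers/protobuf/releases/download/"
        f"v{version}/protoc-{version}-linux-{_ARCH_SUFFIX[machine]}.zip"
    )
```

In practice the `system` and `machine` arguments would come from `platform.system()` and `platform.machine()`; taking them as parameters keeps the helper easy to test on any host.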
Code Reference
Source Location
- Repository: Mlflow_Mlflow
- File: dev/generate_protos.py
- Lines: 1-316
Signature
def gen_protos(
    proto_dir: Path, proto_files: list[Path], lang: Literal["python", "java"],
    protoc_bin: Path, protoc_include_paths: list[Path], out_dir: Path,
) -> None: ...
def gen_stub_files(
    proto_dir: Path, proto_files: list[Path], protoc_bin: Path,
    protoc_include_paths: list[Path], out_dir: Path,
) -> None: ...
def gen_proto_docs(
    proto_dir: Path, proto_files: list[Path], protoc_bin: Path,
    protoc_include_path: Path, out_dir: Path,
) -> None: ...
def apply_python_gencode_replacement(file_path: Path) -> None: ...
def gen_python_protos(protoc_bin: Path, protoc_include_paths: list[Path], out_dir: Path) -> None: ...
def download_and_extract_protoc(version: Literal["3.19.4", "26.0"]) -> tuple[Path, Path]: ...
def generate_final_python_gencode(
    gencode3194_path: Path, gencode5260_path: Path, out_path: Path
) -> None: ...
def main() -> None: ...
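To make the `gen_protos` parameters concrete, here is a hedged sketch of how they might map onto a protoc command line. `build_protoc_command` is a hypothetical helper; only the standard protoc flags `--proto_path`, `--python_out`, and `--java_out` are used, and the real function may assemble its arguments differently.

```python
from pathlib import Path


def build_protoc_command(
    proto_dir: Path,
    proto_files: list[Path],
    lang: str,
    protoc_bin: Path,
    protoc_include_paths: list[Path],
    out_dir: Path,
) -> list[str]:
    """Assemble a protoc argv list from gen_protos-style arguments."""
    if lang not in ("python", "java"):
        raise ValueError(f"Unsupported language: {lang}")
    return [
        str(protoc_bin),
        # Extra include roots, e.g. the OpenTelemetry proto directory.
        *(f"--proto_path={p}" for p in protoc_include_paths),
        f"--proto_path={proto_dir}",
        f"--{lang}_out={out_dir}",
        # Proto files are given relative to proto_dir in this sketch.
        *(str(proto_dir / f) for f in proto_files),
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` so that a protoc failure raises instead of being silently ignored.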
Import
# Run from repository root (Linux only)
python dev/generate_protos.py
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| mlflow/protos/*.proto | Proto files | Yes | MLflow protobuf service and message definitions |
| tests/protos/*.proto | Proto files | Yes | Test protobuf definitions |
| mlflow/protos/opentelemetry/ | Proto directory | Yes | OpenTelemetry proto include files |
Outputs
| Name | Type | Description |
|---|---|---|
| mlflow/protos/*_pb2.py | Python files | Generated Python protobuf code with dual-version branching |
| mlflow/protos/*.pyi | Stub files | Python type stub files for IDE support |
| mlflow/protos/protos.json | JSON file | Proto documentation JSON for REST API doc generation |
| mlflow/java/client/src/main/java/ | Java files | Generated Java protobuf code |
| tests/protos/*_pb2.py | Python files | Generated Python protobuf code for tests |
Usage Examples
Basic Usage
# Generate all protobuf code (must be on Linux)
python dev/generate_protos.py