Principle:Mlflow Mlflow Package Build and Configuration

From Leeroopedia
Knowledge Sources
Domains: Build Systems, Package Management
Last Updated: 2026-02-13 20:00 GMT

Overview

Programmatic generation of package configuration files for multiple distribution variants from a shared dependency specification, with build orchestration for producing distributable artifacts.

Description

When a project ships multiple package variants (a full package, a lightweight "skinny" package, a focused SDK package, and separate development and release configurations), maintaining a hand-written configuration file for each variant is error-prone and lets the variants drift apart. This principle addresses the problem by generating every configuration file programmatically from one shared dependency specification.

The Configuration Generator reads dependency specifications from structured YAML files, each defining packages with their version constraints (minimum version, maximum major version, unsupported versions, environment markers, extras). It also reads the project version from the canonical version file and the minimum Python version from a configuration file. From these inputs, it generates complete package configuration files for each variant:

  • Development variant: Includes all core and lightweight dependencies directly, omitting references to sub-packages (since they are installed from source during development).
  • Release variant: Declares the lightweight and SDK sub-packages as exact-version dependencies, plus core dependencies, ensuring version lock-step across all packages.
  • Skinny variant: Contains only the minimal dependency set for headless operation without storage, serving, or data science libraries.
  • Tracing SDK variant: Contains the minimum dependencies for instrumentation, with explicit file inclusion/exclusion lists to minimize package size.
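
As an illustration of how one dependency entry might be turned into a requirement string, here is a minimal sketch. The field names (`name`, `minimum`, `max_major_version`, `unsupported`, `markers`, `extras`) are assumptions chosen to mirror the constraint types listed above; the actual YAML schema may differ, and the entry is passed as a plain dict so the example needs no YAML library.

```python
def requirement_string(spec: dict) -> str:
    """Render one dependency entry as a PEP 508 style requirement string.

    The dict keys used here are hypothetical, not the project's real schema.
    """
    name = spec["name"]
    extras = spec.get("extras") or []
    if extras:
        name += "[" + ",".join(extras) + "]"

    constraints = []
    if "minimum" in spec:
        constraints.append(f">={spec['minimum']}")
    if "max_major_version" in spec:
        # Allow every release up to and including the given major version.
        constraints.append(f"<{int(spec['max_major_version']) + 1}")
    # Exclude individual releases that are listed as unsupported.
    constraints.extend(f"!={v}" for v in spec.get("unsupported", []))

    requirement = name + ",".join(constraints)
    if spec.get("markers"):
        requirement += f"; {spec['markers']}"
    return requirement


if __name__ == "__main__":
    # "examplepkg" is a placeholder package name.
    print(requirement_string(
        {"name": "examplepkg", "minimum": "1.2", "max_major_version": 3}
    ))
```

Because every variant would render its requirements through the same function, a change to one entry in the shared specification reaches all generated files at once.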

A validation step ensures that the tracing dependency set is a subset of the skinny dependency set, so the two packages cannot declare conflicting version constraints when installed together. Duplicate dependencies within any variant are detected and rejected. Generated configuration files are formatted with a TOML formatter and written only if their content has changed, which avoids unnecessary churn in version control.
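
The two checks described here (no duplicates within a variant, and tracing requirements contained in the skinny requirements) could look roughly like the following. This is a sketch under assumptions: the function names are hypothetical, requirements are represented as plain strings, and the name parsing is a simplified regular expression rather than a full requirement parser.

```python
import re
from collections import Counter

_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def package_name(requirement: str) -> str:
    """Extract a normalized distribution name from a requirement string."""
    match = _NAME_RE.match(requirement.strip())
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    # Treat runs of '-', '_' and '.' as equivalent and ignore case.
    return re.sub(r"[-_.]+", "-", match.group(0)).lower()


def find_duplicates(requirements: list[str]) -> list[str]:
    """Return the names that appear more than once in a dependency list."""
    counts = Counter(package_name(r) for r in requirements)
    return sorted(name for name, n in counts.items() if n > 1)


def validate(tracing: list[str], skinny: list[str]) -> None:
    """Raise ValueError if either list has duplicates or tracing is not a subset."""
    for label, reqs in (("tracing", tracing), ("skinny", skinny)):
        duplicates = find_duplicates(reqs)
        if duplicates:
            raise ValueError(f"duplicate dependencies in {label}: {duplicates}")
    # Compare full strings so that the version constraints must match as well.
    missing = sorted(set(tracing) - set(skinny))
    if missing:
        raise ValueError(f"tracing requirements missing from skinny: {missing}")
```

Comparing whole requirement strings, rather than names only, is a design choice made for this sketch: it also rejects the case where both variants list the same package with different version bounds.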

The Build Orchestrator handles the mechanical process of producing distributable wheel and sdist artifacts. It initializes git submodules, cleans previous build artifacts across all package variants, selects the appropriate variant configuration, and invokes the build backend. For the release variant, it swaps in the release-specific configuration file before building. For sub-packages built from subdirectories, it relocates the resulting artifacts to a central distribution directory. An optional SHA tagging mechanism embeds the git commit hash into the wheel filename as a build tag.
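
The SHA tagging step can be sketched as a pure filename transformation. The wheel filename convention places an optional build tag between the version and the Python tag, and that tag must start with a digit; the `0.sha.` prefix used below is an assumption made for illustration, as is the function name.

```python
def tagged_wheel_name(filename: str, sha: str) -> str:
    """Insert a build tag derived from a commit hash into a wheel filename.

    Expects the untagged form: {name}-{version}-{python}-{abi}-{platform}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    name, version, rest = filename.split("-", 2)
    if rest.count("-") != 2:
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    # Build tags must begin with a digit, hence the leading "0." (assumed prefix).
    build_tag = f"0.sha.{sha}"
    return f"{name}-{version}-{build_tag}-{rest}"


if __name__ == "__main__":
    # Placeholder package name and commit hash.
    print(tagged_wheel_name("examplepkg-1.2.3-py3-none-any.whl", "abc1234"))
```

In an orchestrator, the result would typically be passed to a rename of the file in the distribution directory.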

The Project Configuration itself (pyproject.toml) has a dual structure: a generated section containing package metadata that must stay in sync with other variants, followed by a manually maintained section containing development tool settings (linter configuration, formatter settings, test configuration). The generator preserves the manual section when regenerating the file.
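
One way to preserve the manual section is to split the existing file at a marker line and re-attach everything after it to the freshly generated text. The sketch below assumes such a marker exists; the `SEPARATOR` string and the function name are hypothetical.

```python
# Hypothetical marker line dividing the generated part from the manual part.
SEPARATOR = "# --- manually maintained settings below this line ---\n"


def merge_generated(existing: str, generated: str) -> str:
    """Replace the generated part of a file and keep the manual part as-is."""
    _, found, manual = existing.partition(SEPARATOR)
    if not found:
        raise ValueError("separator line not found in existing file")
    return generated.rstrip("\n") + "\n\n" + SEPARATOR + manual


if __name__ == "__main__":
    old = '[project]\nversion = "1.0"\n\n' + SEPARATOR + "[tool.example]\nline-length = 100\n"
    new = merge_generated(old, '[project]\nversion = "1.1"\n')
    print(new)
```

Failing loudly when the marker is missing prevents the generator from silently discarding hand-written tool settings.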

Usage

This principle applies when dependencies change, new package variants are added, versions are bumped, or release artifacts need to be produced. The generator should be re-run after modifying any YAML requirement file or the version file to ensure all configuration variants remain consistent.

Theoretical Basis

The configuration generation follows a read-transform-validate-write pipeline:

1. Read inputs:
   - Dependency YAML files (skinny, tracing, core, gateway, genai requirements)
   - Project version from version.py
   - Minimum Python version from .python-version
   - ML framework version bounds from the versions YAML

2. Transform per variant:
   - Select the appropriate dependency set for each variant
   - Assemble the full configuration structure (build system, project metadata,
     dependencies, optional dependencies, entry points, package discovery)
   - Apply variant-specific rules (e.g., tracing has no entry points or extras)

3. Validate:
   - Assert tracing requirements are a subset of skinny requirements
   - Assert no duplicate dependencies within any variant
   - Assert version string is found in the canonical location

4. Write:
   - Serialize to TOML format
   - Apply consistent formatting via a TOML formatter
   - Compare with existing file content; write only if changed
   - For the development variant, preserve the manual tool settings section
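
The "write only if changed" part of step 4 can be sketched with the standard library alone. The function name is hypothetical, and serialization and formatting are assumed to have already produced the final text.

```python
from pathlib import Path


def write_if_changed(path: Path, content: str) -> bool:
    """Write content to path unless the file already holds exactly that text.

    Returns True if the file was written, False if it was left untouched.
    """
    if path.exists() and path.read_text(encoding="utf-8") == content:
        return False
    path.write_text(content, encoding="utf-8")
    return True
```

Skipping the write leaves the file's modification time alone, so re-running the generator on unchanged inputs produces no diff and does not trigger tools that watch for file changes.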

The build orchestration follows a clean-configure-build-relocate sequence:

1. Initialize git submodules (for vendored content)
2. Clean all previous build artifacts across all package paths
3. For release builds: swap in the release configuration file
4. Invoke the Python build backend for the selected package variant
5. For sub-packages: move dist artifacts to the central dist/ directory
6. If SHA tagging requested: rename wheel file to include the build tag
7. Restore any swapped configuration files
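
Steps 3 and 7 pair naturally as a context manager, so the original configuration is restored even when the build fails. This is a minimal sketch; the function name and the `.bak` backup suffix are assumptions, and the build invocation itself is left to the caller.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def swapped_config(active: Path, replacement: Path):
    """Temporarily overwrite `active` with `replacement`, then restore it."""
    backup = active.with_name(active.name + ".bak")  # hypothetical backup name
    shutil.copy2(active, backup)
    try:
        shutil.copy2(replacement, active)
        yield active
    finally:
        # Runs on success and on error, so the original always comes back.
        backup.replace(active)
```

A caller would wrap the build step in `with swapped_config(dev_config, release_config): ...`, where both paths are whatever the project uses for its development and release configuration files.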
