Implementation:Mlflow Mlflow Update Changelog
| Knowledge Sources | |
|---|---|
| Domains | Release Automation, Documentation |
| Last Updated | 2026-02-13 20:00 GMT |
Overview
A development script that automatically generates the CHANGELOG.md entry for a new MLflow release by extracting pull request metadata from git history and the GitHub GraphQL API.
Description
update_changelog.py is part of the MLflow release automation workflow. It performs the following steps:
- Determines the commits between the previous release tag and the current release branch using git log with cherry-pick filtering.
- Extracts PR numbers from commit messages matching the pattern (#NNNN).
- Batch-fetches PR metadata (title, author, labels) using the GitHub GraphQL API, processing PRs in chunks of 50 to stay within query size limits.
- Categorizes PRs by their rn/ (release note) labels into sections: breaking changes, highlights, features, bug fixes, documentation updates, and small updates.
- Generates a formatted changelog entry with a version header, date, and categorized sections, then prepends it to CHANGELOG.md.
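The extraction and batching steps above can be sketched roughly as follows. This is an illustration, not the script's actual code: the regular expression, the helper names `extract_pr_number` and `chunked`, and the sample commit subjects are assumptions. Only the `(#NNNN)` pattern and the chunk size of 50 come from the description.

```python
import re
from typing import Optional

# Assumed pattern: a trailing "(#NNNN)" at the end of a commit subject line.
PR_NUMBER_RE = re.compile(r"\(#(\d+)\)\s*$")
CHUNK_SIZE = 50  # chunk size stated in the description


def extract_pr_number(subject: str) -> Optional[int]:
    """Return the PR number embedded in a commit subject, or None if absent."""
    match = PR_NUMBER_RE.search(subject)
    return int(match.group(1)) if match else None


def chunked(items: list, size: int = CHUNK_SIZE) -> list:
    """Split a list into consecutive chunks of at most `size` elements."""
    return [items[i : i + size] for i in range(0, len(items), size)]


# Hypothetical commit subjects, for illustration only.
subjects = [
    "Fix flaky test (#101)",
    "Bump version",
    "Add new flavor (#102)",
]
numbers = (extract_pr_number(s) for s in subjects)
pr_numbers = [n for n in numbers if n is not None]

# 120 placeholder PR numbers split into batches of at most 50.
batches = chunked(list(range(1, 121)))
```

Commits without a PR reference simply yield `None` and are dropped before batching.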
The script uses two NamedTuple data classes: PullRequest, whose string representation includes area labels and a PR link, and Section, which formats a categorized list of PRs. The script validates that every PR has a release note label and that no unknown labels are present; it fails with an assertion error if categorization is incomplete.
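A minimal sketch of how the two data classes might look. The field names and property names follow the signature listed on this page; the `__str__` formats, the URL template, and the `area/` prefix filter are assumptions made for illustration, as are the sample PR number and author.

```python
from typing import Any, NamedTuple


class PullRequest(NamedTuple):
    title: str
    number: int
    author: str
    labels: list[str]

    @property
    def url(self) -> str:
        # Assumed URL template for MLflow pull requests.
        return f"https://github.com/mlflow/mlflow/pull/{self.number}"

    @property
    def release_note_labels(self) -> list[str]:
        return [label for label in self.labels if label.startswith("rn/")]

    def __str__(self) -> str:
        # Assumed format: "[Area1 / Area2] title (#number, @author)".
        areas = " / ".join(
            sorted(
                label.split("/", 1)[1].capitalize()
                for label in self.labels
                if label.startswith("area/")
            )
        )
        return f"[{areas}] {self.title} (#{self.number}, @{self.author})"


class Section(NamedTuple):
    title: str
    items: list[Any]

    def __str__(self) -> str:
        # An empty section renders as an empty string so it can be skipped.
        if not self.items:
            return ""
        bullets = "\n".join(f"- {item}" for item in self.items)
        return f"{self.title}\n\n{bullets}"


# Hypothetical PR, for illustration only.
pr = PullRequest("Fix autologging crash", 12345, "octocat", ["area/tracking", "rn/bug-fix"])
section = Section("Bug fixes:", [pr])
```

Keeping the formatting in `__str__` means a section can be rendered simply by interpolating its items into a bullet list.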
Usage
Use this script when preparing a new MLflow release to generate the changelog entry. The GH_TOKEN environment variable must be set, because the script uses it to authenticate against the GitHub API.
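To illustrate how GH_TOKEN could be used for a batched lookup, the sketch below assembles one GraphQL query for several PRs and prepares an authenticated request without sending it. Whether the script uses field aliases in this way is an assumption, as are the helper names `build_batch_query` and `build_request`; the endpoint and the `repository`, `pullRequest`, `author`, and `labels` fields belong to GitHub's public GraphQL API.

```python
import json
import os
import urllib.request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"


def build_batch_query(pr_numbers: list[int], owner: str = "mlflow", repo: str = "mlflow") -> str:
    """Build one GraphQL query that fetches several PRs using field aliases."""
    fields = "\n".join(
        f"    pr{n}: pullRequest(number: {n}) {{\n"
        "      number title author { login }\n"
        "      labels(first: 100) { nodes { name } }\n"
        "    }"
        for n in pr_numbers
    )
    return f'query {{\n  repository(owner: "{owner}", name: "{repo}") {{\n{fields}\n  }}\n}}'


def build_request(query: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated POST request."""
    return urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical PR numbers; the placeholder token lets the sketch run without credentials.
query = build_batch_query([101, 102])
request = build_request(query, os.environ.get("GH_TOKEN", "<token>"))
```

Passing `request` to `urllib.request.urlopen` would perform the call; that step is left out so the sketch has no network dependency.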
Code Reference
Source Location
- Repository: Mlflow_Mlflow
- File: dev/update_changelog.py
- Lines: 1-281
Signature
class PullRequest(NamedTuple):
    title: str
    number: int
    author: str
    labels: list[str]

    @property
    def url(self) -> str: ...

    @property
    def release_note_labels(self) -> list[str]: ...

class Section(NamedTuple):
    title: str
    items: list[Any]

def get_header_for_version(version: str) -> str: ...
def extract_pr_num_from_git_log_entry(git_log_entry: str) -> int | None: ...
def format_label(label: str) -> str: ...
def is_shallow() -> bool: ...
def batch_fetch_prs_graphql(pr_numbers: list[int]) -> list[PullRequest]: ...
def main(prev_version: str, release_version: str, remote: str) -> None: ...
Import
# Run as a standalone script
python dev/update_changelog.py --prev-version 2.18.0 --release-version 2.19.0
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| --prev-version | str | Yes | The previous MLflow release version (e.g., "2.18.0") |
| --release-version | str | Yes | The MLflow version being released (e.g., "2.19.0") |
| --remote | str | No | Git remote to use (default: "origin") |
Outputs
| Name | Type | Description |
|---|---|---|
| CHANGELOG.md | file | Updated changelog file with new version entry prepended |
| stdout | text | Progress messages including batch fetch status |
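The CHANGELOG.md output can be pictured with the following sketch, which builds an entry and prepends it to a file. The header format, the helper names `build_entry` and `prepend_entry`, the placeholder date, and the sample file contents are assumptions; the example writes to a temporary file rather than the real changelog. If the real file starts with a title line, keeping that title on top would need extra handling that is not shown here.

```python
import tempfile
from datetime import date
from pathlib import Path


def build_entry(version: str, sections: list[str], today: date) -> str:
    """Assemble a version header with date and all non-empty sections."""
    header = f"## {version} ({today.isoformat()})"  # assumed header format
    body = [s for s in sections if s]
    return "\n\n".join([header, *body])


def prepend_entry(changelog: Path, entry: str) -> None:
    """Insert the entry above any existing content in the changelog."""
    existing = changelog.read_text() if changelog.exists() else ""
    changelog.write_text(f"{entry}\n\n{existing}")


# Demonstrate on a temporary file with placeholder content and a placeholder date.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "CHANGELOG.md"
    path.write_text("## 2.18.0\n\nOlder notes\n")
    entry = build_entry("2.19.0", ["Features:\n\n- Example item", ""], date(2000, 1, 1))
    prepend_entry(path, entry)
    result = path.read_text()
```

Empty sections are filtered out before joining, so a release with no breaking changes produces no empty heading.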
Usage Examples
Basic Usage
# Generate changelog for version 2.19.0
export GH_TOKEN="your-github-token"
python dev/update_changelog.py --prev-version 2.18.0 --release-version 2.19.0
Custom Remote
# Use a different git remote
python dev/update_changelog.py --prev-version 2.18.0 --release-version 2.19.0 --remote upstream
Release Note Label Mapping
The script recognizes the following rn/ labels for categorizing pull requests:
| Label | Section |
|---|---|
| rn/breaking-change | Breaking changes |
| rn/highlight | Major new features |
| rn/feature | Features |
| rn/bug-fix | Bug fixes |
| rn/documentation | Documentation updates |
| rn/none | Small bug fixes and documentation updates (grouped by author) |
PRs without any rn/ label cause the script to fail with an assertion error listing the uncategorized PR URLs.
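The mapping and the failure behaviour can be sketched as below. The label-to-section pairs are taken from the table above; the `categorize` helper, the dictionary-based PR records, and the example URLs are hypothetical and stand in for whatever structures the script actually uses.

```python
LABEL_TO_SECTION = {
    "rn/breaking-change": "Breaking changes",
    "rn/highlight": "Major new features",
    "rn/feature": "Features",
    "rn/bug-fix": "Bug fixes",
    "rn/documentation": "Documentation updates",
    "rn/none": "Small bug fixes and documentation updates",
}


def categorize(prs: list[dict]) -> dict[str, list[dict]]:
    """Group PRs by section title, failing if any PR lacks a known rn/ label."""
    sections = {title: [] for title in LABEL_TO_SECTION.values()}
    uncategorized = []
    unknown = set()
    for pr in prs:
        rn_labels = [name for name in pr["labels"] if name.startswith("rn/")]
        if not rn_labels:
            uncategorized.append(pr["url"])
        for label in rn_labels:
            if label in LABEL_TO_SECTION:
                sections[LABEL_TO_SECTION[label]].append(pr)
            else:
                unknown.add(label)
    # Fail loudly so that a release cannot ship with unlabelled PRs.
    assert not uncategorized, f"PRs without a release-note label: {uncategorized}"
    assert not unknown, f"Unknown release-note labels: {sorted(unknown)}"
    return sections


# Hypothetical PR records, for illustration only.
prs = [
    {"number": 1, "url": "https://example.com/pull/1", "labels": ["rn/feature", "area/tracking"]},
    {"number": 2, "url": "https://example.com/pull/2", "labels": ["rn/bug-fix"]},
]
grouped = categorize(prs)
```

Collecting all offending URLs before asserting lets a release manager fix every missing label in one pass instead of rerunning the script once per PR.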