Implementation:MaterializeInc Materialize Annotate Logged Errors
| Knowledge Sources | Materialize CI error annotation system, GitHub Issues API, Buildkite annotations API, JUnit XML parsing |
|---|---|
| Domains | Continuous Integration, Error Classification, Issue Tracking, Test Analytics, Python |
| Last Updated | 2026-02-08 |
Overview
Concrete Python functions for automated CI error triage, provided by Materialize's ci_annotate_errors.py. The script scans test log files and JUnit XML reports, matches each error against known GitHub issues via regex, classifies it as known or unknown, and annotates the Buildkite build with structured error information.
Description
The annotate_logged_errors() function is the core error classification engine. It processes test output from a completed CI step and produces structured annotations. The companion main() function serves as the CLI entry point.
main() function:
- Parses CLI arguments: --cloud-hostname, --test-cmd, --test-desc, --test-result, and positional log_files.
- Initializes a TestAnalyticsDb connection for recording build metrics.
- Records the build job (successful or not) in test analytics.
- Calls annotate_logged_errors() to classify errors.
- Submits analytics updates, handling upload failures gracefully (never failing the build due to analytics errors).
- Determines the final exit code:
  - Returns 0 if all errors are known issues with ci-ignore-failure: true.
  - Returns 1 if the test passed but unknown errors were found in logs.
  - Otherwise returns the original test_result.
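The exit-code rules for main() can be sketched as a small pure function. This is a minimal illustration, not the code from ci_annotate_errors.py: the helper name final_exit_code and the order in which the rules are checked are assumptions.

```python
def final_exit_code(test_result: int, num_unknown: int, ignore_failure: bool) -> int:
    """Hypothetical helper mirroring the three exit-code rules described for main()."""
    if ignore_failure:
        # Every error matched a known issue marked ci-ignore-failure: true.
        return 0
    if test_result == 0 and num_unknown > 0:
        # The test itself passed, but the logs contained unknown errors.
        return 1
    # Otherwise the original test exit code is passed through unchanged.
    return test_result
```

For example, a failing test (exit code 3) with one unknown error keeps exit code 3, while a passing test with unknown errors in its logs is turned into a failure.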
annotate_logged_errors() function:
- Asynchronous artifact fetching: Submits a background task to fetch Buildkite artifacts for URL generation.
- Error collection: Calls get_errors() to scan log files (regex matching) and parse JUnit XML reports, producing a list of ErrorLog, JunitError, and Secret objects.
- GitHub issue fetching: Retrieves all open GitHub issues labeled with CI error patterns via get_known_issues_from_github(). Each issue contains a compiled regex and metadata (state, apply_to filter, location filter, ignore_failure flag).
- Error classification: For each error, the inner handle_error() closure:
  - Searches open issues for a regex match. If found and filters pass (step key, location), classifies as known issue.
  - Searches closed issues for a regex match. If found, classifies as potential regression.
  - If no match, classifies as unknown error.
  - Deduplicates by issue number, reporting each issue at most once.
- Error source processing: Iterates over collected errors:
  - ErrorLog: Resolves artifact URLs from Buildkite, passes match text to handle_error().
  - JunitError: Extracts message, error details, and optional collapsed details (separated by a special delimiter). Coverage-related failures are classified separately as FailureInCoverageRun.
  - Secret: Formats a warning message with the detector name and passes to handle_error().
- Ignore-failure determination: Sets ignore_failure=True only if there are known errors, no unknown errors, and all known errors have issue_ignore_failure=True.
- Annotation: Calls annotate_errors() to post a structured Buildkite annotation with all classified errors and main branch failure history.
- Fallback annotation: If no errors were found but the test failed (and the build is not on main and not canceled), posts a generic failure annotation with main branch status.
- Analytics storage: Records known issues in the test analytics database.
- Returns (number_of_unknown_errors, ignore_failure).
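The classification step performed by handle_error() can be sketched as follows. This is a simplified illustration under stated assumptions: the KnownIssue dataclass, the classify function, the sample patterns, and the placeholder issue numbers are all hypothetical, and the step-key and location filters are omitted.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnownIssue:
    # Hypothetical stand-in for the per-issue metadata described above.
    number: int
    regex: re.Pattern
    state: str  # "open" or "closed"
    ignore_failure: bool = False


def classify(error_text: str, issues: list[KnownIssue], seen: set[int]) -> Optional[str]:
    """Return a classification label, or None if the matching issue was already reported."""
    # Open issues are checked first, then closed ones.
    for wanted_state, label in (("open", "known issue"), ("closed", "potential regression")):
        for issue in issues:
            if issue.state == wanted_state and issue.regex.search(error_text):
                if issue.number in seen:
                    # Deduplicate: each issue is reported at most once.
                    return None
                seen.add(issue.number)
                return label
    return "unknown error"


# Made-up sample data for illustration only.
sample_issues = [
    KnownIssue(1, re.compile(r"panicked at .*overflow"), "open", ignore_failure=True),
    KnownIssue(2, re.compile(r"connection reset"), "closed"),
]
```

Calling classify() repeatedly with a shared seen set reproduces the deduplication behaviour: the second error that matches the same issue yields None rather than a second report.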
Usage
These functions are invoked automatically at the end of each CI test step via Materialize's CI plugins. The main() function is the standard entry point, called as python -m materialize.cli.ci_annotate_errors <log_files>.
Code Reference
Source Location
- misc/python/materialize/cli/ci_annotate_errors.py, lines 420-475 (main)
- misc/python/materialize/cli/ci_annotate_errors.py, lines 528-783 (annotate_logged_errors)
Signature
def main() -> int:
    """
    ci-annotate-errors detects errors in junit xml as well as log files during CI
    and finds associated open GitHub issues in Materialize repository.
    """

def annotate_logged_errors(
    log_files: list[str],
    test_analytics: TestAnalyticsDb,
    test_cmd: str,
    test_desc: str,
    test_result: int,
) -> tuple[int, bool]:
    """
    Returns the number of unknown errors, 0 when all errors are known or there
    were no errors logged as well as whether to ignore the test having failed.
    This will be used to fail a test even if the test itself succeeded, as long
    as it had any unknown error logs.
    """
Import
from materialize.cli.ci_annotate_errors import main, annotate_logged_errors
I/O Contract
Inputs
For main():
| Name | Type | Description |
|---|---|---|
| --cloud-hostname | CLI option (str) | Hostname for the test analytics database connection. |
| --test-cmd | CLI option (str) | The test command that was executed (for annotation context). |
| --test-desc | CLI option (str) | Description of the test (default: empty string). |
| --test-result | CLI option (int) | Exit code of the test command (default: 0). Non-zero indicates test failure. |
| log_files | Positional arguments (list of str) | One or more log file paths to scan for errors. |
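The CLI inputs for main() could be declared with the standard library's argparse roughly as below. This is a sketch under assumptions: the document does not say which argument parser the script uses or whether any option is required, and build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser matching the documented options and defaults."""
    parser = argparse.ArgumentParser(prog="ci-annotate-errors")
    parser.add_argument("--cloud-hostname", type=str)
    parser.add_argument("--test-cmd", type=str)
    parser.add_argument("--test-desc", type=str, default="")
    parser.add_argument("--test-result", type=int, default=0)
    # "One or more log file paths" maps to nargs="+".
    parser.add_argument("log_files", nargs="+")
    return parser


example_args = build_parser().parse_args(
    ["--test-cmd", "mzcompose run default", "--test-result", "1", "a.log", "b.xml"]
)
```

argparse converts the dashes in option names to underscores, so the parsed values are available as example_args.test_cmd, example_args.test_result, and so on.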
For annotate_logged_errors():
| Name | Type | Description |
|---|---|---|
| log_files | list[str] | Paths to log files and/or JUnit XML files to scan. |
| test_analytics | TestAnalyticsDb | Database connection for recording build metrics and known issue data. |
| test_cmd | str | The test command string (used in annotations). |
| test_desc | str | Human-readable test description (used in annotations). |
| test_result | int | Exit code of the test (0 = success, non-zero = failure). |
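Because log_files may mix plain logs and JUnit XML reports, the two kinds of input need different handling. The sketch below illustrates the general idea with the standard library only. The error pattern, function names, and return shapes are hypothetical and do not reflect the script's actual get_errors() implementation.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern; the real tool's error regex is not shown in this document.
ERROR_RE = re.compile(r"^.*(panicked at|ERROR|internal error).*$", re.MULTILINE)


def scan_log_text(text: str) -> list[str]:
    """Return every line of log text that matches the error pattern."""
    return [match.group(0) for match in ERROR_RE.finditer(text)]


def scan_junit_xml(xml_text: str) -> list[tuple[str, str]]:
    """Return (test case name, message) for each failed or errored test case."""
    root = ET.fromstring(xml_text)
    results = []
    for case in root.iter("testcase"):
        for child in case:
            # JUnit reports mark problems with <failure> or <error> child elements.
            if child.tag in ("failure", "error"):
                results.append((case.get("name", ""), child.get("message", "")))
    return results
```

Keeping the two scanners separate makes it straightforward to route each path by file type before handing the collected text to the classification step.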
Outputs
For main():
| Name | Type | Description |
|---|---|---|
| Return value | int | Exit code: 0 if all errors are known and ignorable, 1 if test passed but unknown errors found, otherwise the original test_result. |
| Buildkite annotation | Side effect | Posts a structured error annotation to the Buildkite build via add_annotation_raw(). |
| Test analytics | Side effect | Records build job status, error classifications, and known issue references in the analytics database. |
For annotate_logged_errors():
| Name | Type | Description |
|---|---|---|
| Return value | tuple[int, bool] | A tuple of (number_of_unknown_errors, ignore_failure). The first element is the count of errors not matching any known issue. The second is True if all errors are known issues with ci-ignore-failure: true. |
| Buildkite annotation | Side effect | Posts a structured annotation to the Buildkite build. |
| Test analytics | Side effect | Records annotations and known issues in the analytics database. |
Usage Examples
Running from the command line:
python -m materialize.cli.ci_annotate_errors \
--cloud-hostname analytics.example.com \
--test-cmd "mzcompose run default" \
--test-desc "CDC integration tests" \
--test-result 1 \
junit_*.xml *.log
Calling annotate_logged_errors() programmatically:
from materialize.cli.ci_annotate_errors import annotate_logged_errors
from materialize.test_analytics.config import create_test_analytics_config_with_hostname
from materialize.test_analytics.test_analytics_db import TestAnalyticsDb
config = create_test_analytics_config_with_hostname("analytics.example.com")
db = TestAnalyticsDb(config)
num_unknown, ignore = annotate_logged_errors(
    log_files=["test_output.log", "junit_results.xml"],
    test_analytics=db,
    test_cmd="mzcompose run default",
    test_desc="Kafka source tests",
    test_result=1,
)
if ignore:
    print("All failures are known issues, ignoring")
elif num_unknown > 0:
    print(f"Found {num_unknown} unknown errors")
Understanding the classification logic:
# Each known GitHub issue has a regex pattern in its body.
# The system matches error text against these patterns:
#
# Open issue match -> "Known issue" (may ignore failure)
# Closed issue match -> "Potential regression" (never ignore)
# No match -> "Unknown error" (requires investigation)
#
# The ignore_failure flag is True only when ALL of:
# 1. There are known errors
# 2. There are NO unknown errors
# 3. ALL known issues have ci-ignore-failure: true
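The three conditions for ignore_failure translate directly into a boolean expression. The following is a minimal sketch, assuming the per-issue ignore flags have already been collected into a list. The function name should_ignore_failure is hypothetical.

```python
def should_ignore_failure(known_ignore_flags: list[bool], num_unknown: int) -> bool:
    """Hypothetical helper: True only when all three documented conditions hold."""
    has_known_errors = len(known_ignore_flags) > 0
    no_unknown_errors = num_unknown == 0
    all_ignorable = all(known_ignore_flags)
    return has_known_errors and no_unknown_errors and all_ignorable
```

The explicit has_known_errors check matters because all() returns True for an empty list, so a run with no errors at all would otherwise be treated as ignorable.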