Implementation:MaterializeInc Materialize Annotate Logged Errors

From Leeroopedia


Knowledge Sources: Materialize CI error annotation system, GitHub Issues API, Buildkite annotations API, JUnit XML parsing
Domains: Continuous Integration, Error Classification, Issue Tracking, Test Analytics, Python
Last Updated: 2026-02-08

Overview

Materialize's ci_annotate_errors.py provides concrete Python functions for automated CI error triage. They scan test log files and JUnit XML reports, match errors against known GitHub issues via regex, classify each error as known or unknown, and annotate Buildkite builds with structured error information.

Description

The annotate_logged_errors() function is the core error classification engine. It processes test output from a completed CI step and produces structured annotations. The companion main() function serves as the CLI entry point.

main() function:

  1. Parses CLI arguments: --cloud-hostname, --test-cmd, --test-desc, --test-result, and positional log_files.
  2. Initializes a TestAnalyticsDb connection for recording build metrics.
  3. Records the build job (successful or not) in test analytics.
  4. Calls annotate_logged_errors() to classify errors.
  5. Submits analytics updates, handling upload failures gracefully (never failing the build due to analytics errors).
  6. Determines the final exit code:
    • Returns 0 if all errors are known issues with ci-ignore-failure: true.
    • Returns 1 if the test passed but unknown errors were found in logs.
    • Otherwise returns the original test_result.
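
The three exit-code rules above can be sketched as a small pure function. This is a minimal illustration, not the code in ci_annotate_errors.py: the name final_exit_code and the order in which the rules are checked are assumptions.

```python
def final_exit_code(test_result: int, num_unknown_errors: int, ignore_failure: bool) -> int:
    """Hypothetical helper mirroring the three exit-code rules listed above."""
    if ignore_failure:
        # Every error matched a known issue marked ci-ignore-failure: true.
        return 0
    if test_result == 0 and num_unknown_errors > 0:
        # The test itself passed, but the logs contained unclassified errors.
        return 1
    # No override applies: propagate the test's own exit code.
    return test_result
```

Keeping the decision in a pure function like this makes the override rules easy to unit test without touching Buildkite or the analytics database.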

annotate_logged_errors() function:

  1. Asynchronous artifact fetching: Submits a background task to fetch Buildkite artifacts for URL generation.
  2. Error collection: Calls get_errors() to scan log files (regex matching) and parse JUnit XML reports, producing a list of ErrorLog, JunitError, and Secret objects.
  3. GitHub issue fetching: Retrieves the GitHub issues that define CI error patterns via get_known_issues_from_github(). Both open and closed issues are needed, since step 4 searches each state. Each issue contains a compiled regex and metadata (state, apply_to filter, location filter, ignore_failure flag).
  4. Error classification: For each error, the inner handle_error() closure:
    • Searches open issues for a regex match. If found and filters pass (step key, location), classifies as known issue.
    • Searches closed issues for a regex match. If found, classifies as potential regression.
    • If no match, classifies as unknown error.
    • Deduplicates by issue number, reporting each issue at most once.
  5. Error source processing: Iterates over collected errors:
    • ErrorLog: Resolves artifact URLs from Buildkite, passes match text to handle_error().
    • JunitError: Extracts message, error details, and optional collapsed details (separated by a special delimiter). Coverage-related failures are classified separately as FailureInCoverageRun.
    • Secret: Formats a warning message with the detector name and passes to handle_error().
  6. Ignore-failure determination: Sets ignore_failure=True only if there are known errors, no unknown errors, and all known errors have issue_ignore_failure=True.
  7. Annotation: Calls annotate_errors() to post a structured Buildkite annotation with all classified errors and main branch failure history.
  8. Fallback annotation: If no errors were found but the test failed (and the build is not on main and not canceled), posts a generic failure annotation with main branch status.
  9. Analytics storage: Records known issues in the test analytics database.
  10. Returns (number_of_unknown_errors, ignore_failure).
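
The classification in step 4 can be sketched as follows. This is a simplified illustration under stated assumptions: the Issue dataclass and the classify function are hypothetical stand-ins, the step-key and location filters are omitted, and the "duplicate" label is invented here to show the deduplication by issue number.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Issue:
    """Hypothetical stand-in for a known GitHub issue."""
    number: int
    pattern: re.Pattern
    state: str  # "open" or "closed"
    ignore_failure: bool = False


def classify(error_text: str, issues: list[Issue], seen: set[int]) -> tuple[str, Optional[Issue]]:
    """Return (classification, matching issue); `seen` deduplicates by issue number."""
    # Open issues are searched first, then closed ones, as described in step 4.
    for state, label in (("open", "known issue"), ("closed", "potential regression")):
        for issue in issues:
            if issue.state == state and issue.pattern.search(error_text):
                if issue.number in seen:
                    # This issue was already reported once for the current step.
                    return ("duplicate", issue)
                seen.add(issue.number)
                return (label, issue)
    return ("unknown error", None)
```

A caller would create one empty `seen` set per CI step and pass it to every classify call, so that each issue is reported at most once.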

Usage

These functions are invoked automatically at the end of each CI test step via Materialize's CI plugins. The main() function is the standard entry point, called as python -m materialize.cli.ci_annotate_errors <log_files>.

Code Reference

Source Location

  • misc/python/materialize/cli/ci_annotate_errors.py, lines 420-475 (main)
  • misc/python/materialize/cli/ci_annotate_errors.py, lines 528-783 (annotate_logged_errors)

Signature

def main() -> int:
    """
    ci-annotate-errors detects errors in junit xml as well as log files during CI
    and finds associated open GitHub issues in Materialize repository.
    """
def annotate_logged_errors(
    log_files: list[str],
    test_analytics: TestAnalyticsDb,
    test_cmd: str,
    test_desc: str,
    test_result: int,
) -> tuple[int, bool]:
    """
    Returns the number of unknown errors, 0 when all errors are known or there
    were no errors logged as well as whether to ignore the test having failed.
    This will be used to fail a test even if the test itself succeeded, as long
    as it had any unknown error logs.
    """

Import

from materialize.cli.ci_annotate_errors import main, annotate_logged_errors

I/O Contract

Inputs

For main():

| Name | Type | Description |
|---|---|---|
| --cloud-hostname | CLI option (str) | Hostname for the test analytics database connection. |
| --test-cmd | CLI option (str) | The test command that was executed (for annotation context). |
| --test-desc | CLI option (str) | Description of the test (default: empty string). |
| --test-result | CLI option (int) | Exit code of the test command (default: 0). Non-zero indicates test failure. |
| log_files | Positional arguments (list of str) | One or more log file paths to scan for errors. |
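
The CLI inputs listed above could be declared with argparse roughly as follows. This is a sketch based only on the names, types, and defaults on this page. The build_parser function, the prog string, and the choice of nargs="+" are assumptions, and whether any option is required in the real tool is not stated here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser reproducing the documented options and defaults."""
    parser = argparse.ArgumentParser(prog="ci-annotate-errors")
    parser.add_argument("--cloud-hostname", type=str)
    parser.add_argument("--test-cmd", type=str)
    parser.add_argument("--test-desc", type=str, default="")
    parser.add_argument("--test-result", type=int, default=0)
    # "One or more log file paths" maps naturally to nargs="+".
    parser.add_argument("log_files", nargs="+")
    return parser


# Parse an explicit argument list instead of sys.argv for illustration.
args = build_parser().parse_args(["--test-cmd", "mzcompose run default", "a.log", "junit_1.xml"])
```

Note that argparse converts dashes in option names to underscores, so the values are read as args.cloud_hostname, args.test_cmd, args.test_desc, and args.test_result.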

For annotate_logged_errors():

| Name | Type | Description |
|---|---|---|
| log_files | list[str] | Paths to log files and/or JUnit XML files to scan. |
| test_analytics | TestAnalyticsDb | Database connection for recording build metrics and known issue data. |
| test_cmd | str | The test command string (used in annotations). |
| test_desc | str | Human-readable test description (used in annotations). |
| test_result | int | Exit code of the test (0 = success, non-zero = failure). |
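
Because log_files may include JUnit XML reports, it can help to see what extracting failures from such a report looks like. The sketch below uses the standard library's xml.etree.ElementTree. This is an assumption made for illustration, since this page does not say which parser get_errors() uses, and junit_failures is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def junit_failures(xml_text: str) -> list[tuple[str, str]]:
    """Return (testcase name, message) pairs for failed or errored test cases."""
    root = ET.fromstring(xml_text)
    failures = []
    for case in root.iter("testcase"):
        # JUnit XML marks problems with <failure> or <error> child elements.
        for kind in ("failure", "error"):
            for node in case.findall(kind):
                failures.append((case.get("name", ""), node.get("message", "")))
    return failures


sample = (
    '<testsuite>'
    '<testcase name="a"/>'
    '<testcase name="b"><failure message="boom">trace</failure></testcase>'
    '<testcase name="c"><error message="oops"/></testcase>'
    '</testsuite>'
)
result = junit_failures(sample)
```

Each extracted message could then be matched against the known-issue regexes in the same way as text taken from plain log files.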

Outputs

For main():

| Name | Type | Description |
|---|---|---|
| Return value | int | Exit code: 0 if all errors are known and ignorable, 1 if test passed but unknown errors found, otherwise the original test_result. |
| Buildkite annotation | Side effect | Posts a structured error annotation to the Buildkite build via add_annotation_raw(). |
| Test analytics | Side effect | Records build job status, error classifications, and known issue references in the analytics database. |

For annotate_logged_errors():

| Name | Type | Description |
|---|---|---|
| Return value | tuple[int, bool] | A tuple of (number_of_unknown_errors, ignore_failure). The first element is the count of errors not matching any known issue. The second is True only if at least one known error was found, no unknown errors were found, and every known issue sets ci-ignore-failure: true. |
| Buildkite annotation | Side effect | Posts a structured annotation to the Buildkite build. |
| Test analytics | Side effect | Records annotations and known issues in the analytics database. |

Usage Examples

Running from the command line:

python -m materialize.cli.ci_annotate_errors \
    --cloud-hostname analytics.example.com \
    --test-cmd "mzcompose run default" \
    --test-desc "CDC integration tests" \
    --test-result 1 \
    junit_*.xml *.log

Calling annotate_logged_errors() programmatically:

from materialize.cli.ci_annotate_errors import annotate_logged_errors
from materialize.test_analytics.config import create_test_analytics_config_with_hostname
from materialize.test_analytics.test_analytics_db import TestAnalyticsDb

config = create_test_analytics_config_with_hostname("analytics.example.com")
db = TestAnalyticsDb(config)

num_unknown, ignore = annotate_logged_errors(
    log_files=["test_output.log", "junit_results.xml"],
    test_analytics=db,
    test_cmd="mzcompose run default",
    test_desc="Kafka source tests",
    test_result=1,
)

if ignore:
    print("All failures are known issues, ignoring")
elif num_unknown > 0:
    print(f"Found {num_unknown} unknown errors")

Understanding the classification logic:

# Each known GitHub issue has a regex pattern in its body.
# The system matches error text against these patterns:
#
# Open issue match   -> "Known issue" (may ignore failure)
# Closed issue match -> "Potential regression" (never ignore)
# No match           -> "Unknown error" (requires investigation)
#
# The ignore_failure flag is True only when ALL of:
#   1. There are known errors
#   2. There are NO unknown errors
#   3. ALL known issues have ci-ignore-failure: true
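
The three conditions for ignore_failure translate directly into a boolean expression. In this minimal sketch, should_ignore_failure and its parameters are hypothetical names. known_ignore_flags is assumed to hold one ci-ignore-failure value per known error.

```python
def should_ignore_failure(known_ignore_flags: list[bool], num_unknown_errors: int) -> bool:
    """Apply the three conditions listed above."""
    return (
        len(known_ignore_flags) > 0      # 1. there are known errors
        and num_unknown_errors == 0      # 2. there are no unknown errors
        and all(known_ignore_flags)      # 3. every known issue is ignorable
    )
```

The explicit non-empty check matters because all() returns True for an empty list. Without it, a failed step with no classified errors would be treated as ignorable.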

Related Pages

Implements Principle

Requires Environment
