Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Huggingface Datatrove FailedLogs

From Leeroopedia
Knowledge Sources
Domains Pipeline Operations, Debugging
Last Updated 2026-02-14 17:00 GMT

Overview

FailedLogs is a command-line tool that identifies incomplete pipeline tasks and displays their log files to assist in diagnosing failures.

Description

The failed_logs module provides a diagnostic tool for investigating pipeline task failures. It reads the executor.json configuration from a logging directory to determine the total number of expected tasks (world_size), then scans the completions subdirectory to identify which task ranks did not complete successfully. For each incomplete task, it locates the corresponding log file using a regex pattern (`logs/task_XXXXX.log`) and displays the log content through a paginated console interface.

The tool uses the rich library for formatted console output, including status spinners during directory scanning and interactive pagination through log files. After displaying each log, it prompts the user with a Confirm dialog to decide whether to view the next incomplete task's log, allowing selective review of failures.

The tool extracts the task rank from log filenames using a compiled regex pattern (RANK_FROM_LOG_FILENAME_REGEX) and cross-references these against the set of incomplete ranks to display only the relevant logs.

Usage

Use this tool when a distributed pipeline run has incomplete tasks and you need to investigate why specific tasks failed. Point it at the logging directory of the run and it will identify and display the logs of all failed tasks.

Code Reference

Source Location

Signature

def main():
    """
    Takes a `logging_dir` as input, gets total number of tasks from `executor.json`
    and then gets which ranks are incomplete by scanning `logging_dir/completions`.
    The log files for the incomplete tasks are then displayed.
    """

Import

from datatrove.tools.failed_logs import main

I/O Contract

Inputs

Name Type Required Description
path str (CLI argument) No Path to the logging folder containing executor.json (default: current directory)

Outputs

Name Type Description
Console output Rich formatted text Paginated display of log files for each incomplete task

Usage Examples

Basic Usage

# From the command line
python -m datatrove.tools.failed_logs /path/to/logging_dir
# Programmatic usage
from datatrove.tools.failed_logs import main
import sys

sys.argv = ["failed_logs", "/path/to/logging_dir"]
main()

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment