

Principle:Ucbepic Docetl Output Management

From Leeroopedia


Knowledge Sources
Domains: Data_Engineering, Observability
Last Updated: 2026-02-08 01:40 GMT

Overview

Output Management is a persistence and observability principle. It covers saving pipeline results, checkpointing intermediate outputs, and reporting execution progress and costs through console logging.

Description

Output Management covers the final phase of pipeline execution: persisting results to disk, maintaining intermediate checkpoints for fault tolerance, and providing visibility into execution progress and costs. In DocETL, this includes:

  • Result Persistence: Writing final output as JSON or CSV files
  • Intermediate Checkpointing: Saving per-operation results to enable resumption after failures
  • Console Logging: Thread-safe console output tracking costs, operation progress, and execution summaries
  • Cost Tracking: Aggregating LLM API costs across all operations
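As an illustration of the Result Persistence item above, here is a minimal sketch that picks the output format from the file extension. The helper name `save_results` and the extension-based dispatch are assumptions made for this example; they are not taken from DocETL's code.

```python
import csv
import json
from pathlib import Path


def save_results(records: list[dict], output_path: str) -> Path:
    """Write records as JSON or CSV, chosen by file extension (hypothetical helper)."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    suffix = path.suffix.lower()

    if suffix == ".json":
        with path.open("w", encoding="utf-8") as f:
            json.dump(records, f, indent=2, ensure_ascii=False)
    elif suffix == ".csv":
        # Union of keys in first-seen order, so rows with missing fields still fit.
        fieldnames = list(dict.fromkeys(key for row in records for key in row))
        with path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(records)  # missing keys are written as empty cells
    else:
        raise ValueError(f"Unsupported output format: {suffix!r}")
    return path
```

CSV flattens every value to text, so nested fields survive a round trip only in the JSON form.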

Usage

This principle applies whenever a pipeline produces output that needs to be saved, or when operators require visibility into execution progress. It is especially important for long-running pipelines where intermediate checkpointing prevents loss of work.
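To show how checkpointing prevents loss of work, the following sketch runs a list of operations and skips any whose result is already on disk. The function `run_with_checkpoints`, the one-file-per-operation layout, and the JSON format are assumptions for illustration; DocETL's actual checkpoint layout and invalidation rules are not described on this page.

```python
import json
from pathlib import Path
from typing import Callable

Records = list[dict]


def run_with_checkpoints(
    data: Records,
    operations: list[tuple[str, Callable[[Records], Records]]],
    checkpoint_dir: str,
) -> Records:
    """Run operations in order, reusing a saved result when one exists."""
    directory = Path(checkpoint_dir)
    directory.mkdir(parents=True, exist_ok=True)

    for name, operation in operations:
        checkpoint = directory / f"{name}.json"
        if checkpoint.exists():
            # A previous run finished this step; load its output instead of recomputing.
            data = json.loads(checkpoint.read_text(encoding="utf-8"))
            continue

        data = operation(data)

        # Write to a temporary file first, then rename it into place,
        # so an interrupted run does not leave a half-written checkpoint.
        tmp = directory / f"{name}.json.tmp"
        tmp.write_text(json.dumps(data), encoding="utf-8")
        tmp.replace(checkpoint)
    return data
```

This sketch keys checkpoints on the operation name only. A real pipeline would also need to discard a checkpoint when the operation's configuration or its input changes, for example by including a hash of both in the file name.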

Theoretical Basis

Output management combines a layered persistence strategy with cost and progress reporting:

  1. Checkpointing: Save intermediate results after each operation completes
  2. Final Output: Write complete pipeline results to the configured output path
  3. Cost Aggregation: Sum per-operation LLM costs into a total
  4. Logging: Provide real-time execution feedback via console output
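Steps 3 and 4 above can be sketched together as a small tracker that sums per-operation costs and prints progress under a lock, so that lines from concurrent workers do not interleave. The class `CostTracker` and its methods are hypothetical names for this example, not DocETL's API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CostTracker:
    """Accumulate per-operation costs and log them under one lock (hypothetical)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._costs: dict[str, float] = {}

    def add(self, operation: str, cost: float) -> None:
        with self._lock:
            self._costs[operation] = self._costs.get(operation, 0.0) + cost
            # Printing while holding the lock keeps each log line intact.
            print(f"[{operation}] +${cost:.4f}")

    def total(self) -> float:
        with self._lock:
            return sum(self._costs.values())

    def summary(self) -> str:
        # Copy under the lock, format outside it; the lock is not reentrant.
        with self._lock:
            costs = dict(self._costs)
        lines = [f"{name}: ${cost:.4f}" for name, cost in costs.items()]
        lines.append(f"Total: ${sum(costs.values()):.4f}")
        return "\n".join(lines)


if __name__ == "__main__":
    tracker = CostTracker()
    with ThreadPoolExecutor(max_workers=4) as pool:
        for i in range(8):
            pool.submit(tracker.add, "map" if i % 2 == 0 else "reduce", 0.0025)
    print(tracker.summary())
```

A single lock guarding both the totals and the console is the simplest design. A pipeline with heavy logging might instead route messages through a queue so that workers never wait on output.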

Related Pages

Implemented By
