
Principle:Ucbepic Docetl Chunk Result Reduction

From Leeroopedia
Revision as of 17:26, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/Ucbepic_Docetl_Chunk_Result_Reduction.md)


Knowledge Sources
Domains NLP, Data_Aggregation
Last Updated 2026-02-08 01:40 GMT

Overview

An aggregation principle that merges per-chunk LLM results back into per-document summaries using group-by reduction with LLM-powered synthesis.

Description

Chunk Result Reduction reassembles chunk-level analysis results into coherent per-document outputs. After each chunk has been independently processed by MapOperation, the reduce operation groups chunks by their original document ID and synthesizes a unified result using an LLM prompt.

Strategies for handling large groups include:

  • Batch Reduce: Process all chunks in a single LLM call (for small groups)
  • Fold and Merge: Incrementally fold chunks into a running summary (for large groups)
  • Parallel Fold: Process fold batches in parallel with a final merge step
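The three strategies above can be sketched in a few lines of Python. This is a minimal illustration, not the library's implementation: `synthesize` is a hypothetical stand-in for the LLM call, and `fold_batch_size` and `max_workers` are assumed parameter names chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional

# Hypothetical stand-in for an LLM call: receives an optional running summary
# plus a batch of chunk results, and returns a merged summary.
Synthesize = Callable[[Optional[str], List[str]], str]


def batch_reduce(chunks: List[str], synthesize: Synthesize) -> str:
    """Batch Reduce: send every chunk result to the model in one call."""
    return synthesize(None, chunks)


def fold_reduce(
    chunks: List[str], synthesize: Synthesize, fold_batch_size: int = 2
) -> Optional[str]:
    """Fold and Merge: fold successive batches into a running summary."""
    summary: Optional[str] = None
    for start in range(0, len(chunks), fold_batch_size):
        batch = chunks[start:start + fold_batch_size]
        summary = synthesize(summary, batch)
    return summary  # None only when there are no chunks


def parallel_fold_reduce(
    chunks: List[str],
    synthesize: Synthesize,
    fold_batch_size: int = 2,
    max_workers: int = 4,
) -> str:
    """Parallel Fold: summarize batches concurrently, then merge the partials."""
    batches = [
        chunks[i:i + fold_batch_size]
        for i in range(0, len(chunks), fold_batch_size)
    ]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in input order, so document order is kept.
        partials = list(pool.map(lambda b: synthesize(None, b), batches))
    return synthesize(None, partials)  # final merge step


def concat_synthesize(summary: Optional[str], batch: List[str]) -> str:
    """Deterministic placeholder for the LLM: joins inputs with ' | '."""
    parts = ([summary] if summary else []) + list(batch)
    return " | ".join(parts)
```

With the placeholder `concat_synthesize`, all three strategies produce the same string for the same input, which makes it easy to see that they differ only in how many calls are made and how much context each call carries. With a real model the outputs would generally differ.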

Usage

Apply this principle after chunk-level processing to produce per-document results. The reduce key should be the document ID generated by the split operation.

Theoretical Basis

Group-by reduction with LLM synthesis:

  1. Grouping: Group chunk results by reduce_key (document ID)
  2. Sorting: Order chunks within each group (for example, by their position in the original document) so that synthesis sees them in sequence
  3. Strategy Selection: Choose batch, fold, or parallel fold based on group size
  4. LLM Synthesis: Use prompt template to merge chunk results into a unified output
  5. Result Assembly: Produce one output record per document
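The five steps above can be sketched end to end as follows. This is an illustrative outline under stated assumptions rather than the actual reduce implementation: the record fields `doc_id`, `chunk_num`, and `result`, the `batch_threshold` cutoff, and the `synthesize` callable (a placeholder for the prompt-driven LLM call) are all hypothetical names introduced for this example.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List


def reduce_chunk_results(
    chunk_results: List[Dict[str, Any]],
    synthesize: Callable[[List[str]], str],  # hypothetical LLM stand-in
    reduce_key: str = "doc_id",       # assumed field holding the document ID
    order_key: str = "chunk_num",     # assumed field holding the chunk position
    batch_threshold: int = 3,         # assumed cutoff between batch and fold
) -> List[Dict[str, Any]]:
    # 1. Grouping: collect chunk results by document ID.
    groups: Dict[Any, List[Dict[str, Any]]] = defaultdict(list)
    for record in chunk_results:
        groups[record[reduce_key]].append(record)

    outputs = []
    for doc_id, group in groups.items():
        # 2. Sorting: restore the original chunk order within the group.
        group.sort(key=lambda r: r[order_key])
        texts = [r["result"] for r in group]

        # 3. Strategy selection: small groups go through in one call,
        #    larger ones are folded one chunk at a time.
        if len(texts) <= batch_threshold:
            strategy = "batch"
            summary = synthesize(texts)  # 4. LLM synthesis (single call)
        else:
            strategy = "fold"
            summary = texts[0]
            for text in texts[1:]:
                summary = synthesize([summary, text])  # 4. incremental synthesis

        # 5. Result assembly: exactly one output record per document.
        outputs.append({reduce_key: doc_id, "summary": summary, "strategy": strategy})
    return outputs


def join_synthesize(texts: List[str]) -> str:
    """Deterministic placeholder for the LLM: joins inputs with a space."""
    return " ".join(texts)
```

A small usage example, with chunk results arriving out of order:

```python
records = [
    {"doc_id": "A", "chunk_num": 2, "result": "a2"},
    {"doc_id": "A", "chunk_num": 1, "result": "a1"},
    {"doc_id": "B", "chunk_num": 3, "result": "b3"},
    {"doc_id": "B", "chunk_num": 1, "result": "b1"},
    {"doc_id": "B", "chunk_num": 4, "result": "b4"},
    {"doc_id": "B", "chunk_num": 2, "result": "b2"},
]
per_document = reduce_chunk_results(records, join_synthesize)
```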

Related Pages

Implemented By
