Principle:Ucbepic Docetl Chunk Result Reduction

From Leeroopedia


Knowledge Sources
Domains: NLP, Data_Aggregation
Last Updated: 2026-02-08 01:40 GMT

Overview

An aggregation principle that merges per-chunk LLM results back into per-document summaries using group-by reduction with LLM-powered synthesis.

Description

Chunk Result Reduction reassembles chunk-level analysis results into coherent per-document outputs. After MapOperation has processed each chunk independently, the reduce operation groups the chunk results by their original document ID and uses an LLM prompt to synthesize one unified result per document.

Strategies for handling large groups include:

  • Batch Reduce: Process all chunks in a single LLM call (for small groups)
  • Fold and Merge: Incrementally fold chunks into a running summary (for large groups)
  • Parallel Fold: Process fold batches in parallel with a final merge step
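The three strategies can be sketched in a few lines of Python. This is a minimal illustration, not DocETL's implementation: `synthesize` is a hypothetical stand-in for an LLM call that takes a list of texts and returns one merged text, and `batch_size` and `max_workers` are assumed parameter names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical stand-in for an LLM call: many texts in, one merged text out.
Synthesize = Callable[[List[str]], str]


def batch_reduce(chunks: List[str], synthesize: Synthesize) -> str:
    """Batch Reduce: send every chunk result to the LLM in a single call."""
    if not chunks:
        raise ValueError("cannot reduce an empty group")
    return synthesize(chunks)


def fold_reduce(chunks: List[str], synthesize: Synthesize, batch_size: int = 2) -> str:
    """Fold: merge one batch at a time into a running summary."""
    if not chunks:
        raise ValueError("cannot reduce an empty group")
    summary = None
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        # The running summary (if any) is passed in alongside the new batch.
        inputs = batch if summary is None else [summary] + batch
        summary = synthesize(inputs)
    return summary


def parallel_fold_reduce(chunks: List[str], synthesize: Synthesize,
                         batch_size: int = 2, max_workers: int = 4) -> str:
    """Parallel Fold: summarize batches concurrently, then merge the partials."""
    if not chunks:
        raise ValueError("cannot reduce an empty group")
    batches = [chunks[i:i + batch_size] for i in range(0, len(chunks), batch_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(synthesize, batches))  # map preserves batch order
    return partials[0] if len(partials) == 1 else synthesize(partials)


def fake_llm(texts: List[str]) -> str:
    """Toy synthesizer used only for demonstration: joins its inputs."""
    return " | ".join(texts)
```

With a real LLM the three functions would generally return different text for the same input, since folding and merging change what the model sees in each call. The trade-off is context size and latency: batch reduce needs the whole group to fit in one prompt, sequential folding keeps each prompt small, and parallel folding shortens wall-clock time at the cost of an extra merge call.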

Usage

Apply this principle after chunk-level processing to produce per-document results. The reduce key should be the document ID generated by the split operation.
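As a rough illustration of the reduce key choice, the sketch below writes a reduce operation as a Python dict whose shape is assumed to resemble a DocETL operation definition. Every concrete name here (`summarize_document`, `split_doc_id`, `chunk_summary`, `document_summary`) is hypothetical, and the helper `records_missing_key` is not part of any library; check the DocETL documentation for the exact fields your version supports.

```python
from typing import Any, Dict, List

# Illustrative reduce-operation config (assumed shape, hypothetical names).
# "split_doc_id" stands for the document ID field produced by the split step.
reduce_op: Dict[str, Any] = {
    "name": "summarize_document",
    "type": "reduce",
    "reduce_key": "split_doc_id",
    "prompt": (
        "Combine these chunk summaries into one document summary:\n"
        "{% for item in inputs %}- {{ item.chunk_summary }}\n{% endfor %}"
    ),
    "output": {"schema": {"document_summary": "string"}},
}


def records_missing_key(records: List[Dict[str, Any]], reduce_key: str) -> List[Dict[str, Any]]:
    """Return the records that lack the reduce key and so cannot be grouped."""
    return [record for record in records if reduce_key not in record]
```

Running a check like `records_missing_key(chunk_results, reduce_op["reduce_key"])` before the reduce step is one way to catch chunk records that lost their document ID during earlier processing.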

Theoretical Basis

Group-by reduction with LLM synthesis:

  1. Grouping: Group chunk results by reduce_key (document ID)
  2. Sorting: Order chunks within each group
  3. Strategy Selection: Choose batch, fold, or parallel fold based on group size
  4. LLM Synthesis: Use prompt template to merge chunk results into a unified output
  5. Result Assembly: Produce one output record per document
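The five steps above can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the library's code: `synthesize` is a hypothetical stand-in for the LLM prompt call, the field names passed in (`reduce_key`, `order_key`, `text_key`) are assumptions about the chunk records, and `batch_threshold` is an invented cutoff between the batch and fold strategies.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List


def reduce_chunks(records: List[Dict[str, Any]], reduce_key: str, order_key: str,
                  text_key: str, synthesize: Callable[[List[str]], str],
                  batch_threshold: int = 3) -> List[Dict[str, Any]]:
    # 1. Grouping: bucket chunk results by the reduce key (document ID).
    groups: Dict[Any, List[Dict[str, Any]]] = defaultdict(list)
    for record in records:
        groups[record[reduce_key]].append(record)

    outputs = []
    for doc_id, group in groups.items():
        # 2. Sorting: order chunks within the group by the assumed order field.
        ordered = sorted(group, key=lambda r: r[order_key])
        texts = [r[text_key] for r in ordered]

        # 3. Strategy selection: one call for small groups, a fold otherwise.
        if len(texts) <= batch_threshold:
            # 4. LLM synthesis (batch): merge everything in one call.
            merged = synthesize(texts)
        else:
            # 4. LLM synthesis (fold): merge chunks into a running summary.
            merged = texts[0]
            for text in texts[1:]:
                merged = synthesize([merged, text])

        # 5. Result assembly: exactly one output record per document.
        outputs.append({reduce_key: doc_id, "summary": merged})
    return outputs


if __name__ == "__main__":
    demo = [
        {"doc_id": "d1", "chunk_num": 2, "result": "B"},
        {"doc_id": "d2", "chunk_num": 1, "result": "X"},
        {"doc_id": "d1", "chunk_num": 1, "result": "A"},
    ]
    # Toy synthesizer that joins its inputs; a real one would call an LLM.
    print(reduce_chunks(demo, "doc_id", "chunk_num", "result", " + ".join))
```

Output records follow the order in which each document ID first appears in the input, because Python dicts preserve insertion order.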

Related Pages

Implemented By
