Principle: NeuML txtai Workflow Execution
| Knowledge Sources | |
|---|---|
| Domains | Data_Processing, Workflow, Pipeline |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Workflow Execution is the principle governing how a composed Workflow actually runs when invoked with input data. Execution follows a well-defined lifecycle: create an execution context (thread/process pool), run task initializers, optionally apply stream pre-processing, chunk input into batches, process each batch through every task sequentially, yield results as a generator, and finally run task finalizers. This lifecycle ensures deterministic ordering, efficient resource management, and lazy evaluation of results.
Description
When a Workflow instance is called with an iterable of data elements, it performs the following steps in order:
- Executor Creation: An `Execute` context manager is created with the configured number of workers. This provides a thread or process pool that multi-action tasks can use for concurrent execution. The executor is scoped to the lifetime of the workflow call using a `with` statement, ensuring clean resource cleanup.
- Initialization: The workflow iterates over all tasks and calls each task's `initialize` function (if defined). This allows tasks to perform setup operations such as opening database connections, resetting state, or pre-loading resources before any data is processed.
- Stream Processing: If a `stream` callable was provided at composition time, the input elements are passed through it. This enables pre-filtering, augmentation, or transformation of the data before it enters the task chain.
- Chunking: Input elements are divided into batches of the configured size. The chunking algorithm handles both fixed-size inputs (lists, arrays) using efficient slicing and dynamically-generated inputs (generators) using iterative accumulation. This dual approach ensures optimal performance regardless of input type.
- Sequential Task Processing: Each batch passes through every task in order. Within the `process` method, the batch is fed to task 0, the output becomes input for task 1, and so on. This guarantees that data transformations are applied in the exact order specified during composition.
- Yielding Results: Results are yielded from each batch as they complete, making the workflow a generator. This lazy evaluation means that results are available as soon as each batch finishes processing, without waiting for the entire input to be consumed. It also enables memory-efficient processing of large datasets.
- Finalization: After all batches have been processed, the workflow calls each task's `finalize` function (if defined). This enables cleanup operations such as flushing buffers, triggering index builds, closing connections, or computing aggregate statistics.
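The lifecycle above can be sketched in plain Python. This is an illustrative sketch of the execution order, not txtai's actual implementation; the minimal `Task` class and the `Workflow` constructor signature here are assumptions for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

class Task:
    """Hypothetical minimal task: an action plus optional lifecycle hooks."""
    def __init__(self, action, initialize=None, finalize=None):
        self.action, self.initialize, self.finalize = action, initialize, finalize

class Workflow:
    """Sketch of the execution lifecycle described above."""
    def __init__(self, tasks, batch=100, workers=None, stream=None):
        self.tasks, self.batch, self.workers, self.stream = tasks, batch, workers, stream

    def __call__(self, elements):
        # 1. Executor scoped to this call via a with statement (available
        #    to multi-action tasks; unused by the single-action tasks below)
        with ThreadPoolExecutor(max_workers=self.workers) as executor:
            # 2. Run task initializers
            for task in self.tasks:
                if task.initialize:
                    task.initialize()

            # 3. Optional stream pre-processing
            if self.stream:
                elements = self.stream(elements)

            # 4. Chunk input into batches; 5./6. process each batch through
            #    every task in order, then yield results lazily
            iterator = iter(elements)
            while batch := list(islice(iterator, self.batch)):
                for task in self.tasks:
                    batch = list(task.action(batch))
                yield from batch

            # 7. Run task finalizers after all batches complete
            for task in self.tasks:
                if task.finalize:
                    task.finalize()

workflow = Workflow([Task(lambda batch: [x * 2 for x in batch]),
                     Task(lambda batch: [x + 1 for x in batch])], batch=2)
print(list(workflow(range(5))))  # [1, 3, 5, 7, 9]
```

Because `__call__` is a generator, nothing runs until the result is iterated; each batch's results become available as soon as that batch clears the final task.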
Usage
Use Workflow Execution when you need to:
- Process data through a multi-step NLP pipeline with guaranteed task ordering and batch-level control.
- Handle large datasets efficiently using generator-based lazy evaluation, avoiding loading all results into memory.
- Leverage concurrency for multi-action tasks through the automatically managed executor pool.
- Execute setup and teardown logic around the data processing lifecycle via task initializers and finalizers.
- Process both finite and streaming inputs: fixed-size lists and infinite generators are both supported by the chunking algorithm.
Theoretical Basis
Workflow Execution combines several established patterns:
Generator-Based Lazy Evaluation: By yielding results rather than returning a complete list, the workflow follows Python's iterator protocol. This enables streaming processing where producers and consumers operate in a pipeline fashion. Memory usage is bounded by the batch size rather than the total input size, making it practical to process datasets that do not fit in memory.
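A minimal sketch of this property, assuming nothing about txtai internals: a batched generator pipeline can consume an infinite producer because memory use is bounded by the batch size, and the consumer decides how many results to take.

```python
from itertools import islice

def numbers():
    # Infinite producer: could never be materialized into a list
    n = 0
    while True:
        yield n
        n += 1

def square_batches(source, size=3):
    # Consume the source lazily, one batch at a time; memory is bounded
    # by `size`, not by the total (here infinite) input length
    it = iter(source)
    while batch := list(islice(it, size)):
        yield from (x * x for x in batch)

# Take only the first five results from an infinite stream
print(list(islice(square_batches(numbers()), 5)))  # [0, 1, 4, 9, 16]
```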
Resource-Scoped Execution: The Execute context manager implements the RAII (Resource Acquisition Is Initialization) pattern. The thread/process pool is created when execution begins and automatically cleaned up when execution ends, even if exceptions occur. This prevents resource leaks in long-running applications.
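The pattern can be sketched as follows; the class name `Execute` comes from the text above, but the constructor signature and internals here are assumptions, not txtai's actual code.

```python
from concurrent.futures import ThreadPoolExecutor

class Execute:
    """Sketch of an executor-scoping context manager (RAII style)."""
    def __init__(self, workers=None):
        self.workers = workers
        self.pool = None

    def __enter__(self):
        # Resource acquired when execution begins
        self.pool = ThreadPoolExecutor(max_workers=self.workers)
        return self.pool

    def __exit__(self, exc_type, exc, tb):
        # Runs even if an exception was raised inside the with block
        self.pool.shutdown(wait=True)
        return False  # propagate any exception

try:
    with Execute(workers=2) as pool:
        results = list(pool.map(lambda x: x + 1, [1, 2, 3]))
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass  # the pool was still shut down cleanly
print(results)  # [2, 3, 4]
```

The `try`/`except` demonstrates the key guarantee: cleanup happens on the error path as well as the success path, so long-running applications do not leak pool threads.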
Batch Processing: The chunking mechanism implements mini-batch processing, a technique from machine learning where inputs are grouped into fixed-size batches. This balances throughput (maximizing GPU utilization) against latency (time to first result) and memory consumption. The dual chunking algorithm (slicing for sized inputs, accumulation for generators) ensures optimal performance across input types.
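The dual chunking algorithm can be sketched like this (an illustration of the idea, not txtai's exact code): sized inputs are sliced directly, while generators are accumulated batch by batch.

```python
from itertools import islice

def chunk(elements, size):
    """Yield fixed-size batches; slice sized inputs, accumulate generators."""
    if hasattr(elements, "__len__"):
        # Fixed-size input (list, array): efficient direct slicing
        for i in range(0, len(elements), size):
            yield elements[i:i + size]
    else:
        # Dynamically-generated input: iterative accumulation,
        # never materializing more than one batch at a time
        it = iter(elements)
        while batch := list(islice(it, size)):
            yield batch

print(list(chunk([1, 2, 3, 4, 5], 2)))        # [[1, 2], [3, 4], [5]]
print(list(chunk((x for x in range(5)), 2)))  # [[0, 1], [2, 3], [4]]
```

Both paths produce the same batches; the split exists purely for performance, since slicing a list avoids the per-element iteration that a generator requires.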
Sequential Composition with Parallel Execution: Tasks are composed sequentially (the output of one feeds the next), but within each task, multiple actions can execute in parallel via the executor. This provides a hybrid parallelism model that is both easy to reason about (sequential data flow) and efficient (parallel action execution).
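A sketch of this hybrid model, under stated assumptions: tasks are modeled as lists of actions, and multi-action outputs are merged per element with `zip`. How txtai actually merges multi-action outputs may differ; this merge step is an assumption for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def run(tasks, batch, workers=4):
    """Tasks run sequentially; actions within a task run in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        for actions in tasks:
            if len(actions) > 1:
                # Multi-action task: every action sees the same batch,
                # executed concurrently on the pool
                outputs = list(executor.map(lambda a: a(batch), actions))
                # Merge per-element outputs of all actions (assumed strategy)
                batch = list(zip(*outputs))
            else:
                # Single-action task: plain sequential data flow
                batch = actions[0](batch)
    return batch

tasks = [
    [lambda b: [x + 1 for x in b]],                                 # single action
    [lambda b: [x * 2 for x in b], lambda b: [x ** 2 for x in b]],  # two parallel actions
]
print(run(tasks, [1, 2, 3]))  # [(4, 4), (6, 9), (8, 16)]
```

The outer loop preserves the easy-to-reason-about sequential data flow, while `executor.map` supplies the parallelism within a single task.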