# Heuristic: Deepset AI Haystack Pipeline Max Runs Safety Limit
| Knowledge Sources | |
|---|---|
| Domains | Pipeline_Execution, Debugging |
| Last Updated | 2026-02-11 20:00 GMT |
## Overview
Pipelines enforce a maximum of 100 runs per component (default) to prevent infinite loops in cyclic graphs; increase this value only for intentionally iterative workflows.
## Description
Haystack pipelines support cyclic execution graphs where components can feed output back into earlier components. This enables iterative refinement workflows (e.g., agent loops, self-correcting RAG). To prevent runaway execution, the pipeline tracks how many times each component has been invoked and raises `PipelineMaxComponentRuns` when any component exceeds `max_runs_per_component` (default 100). Cycles are detected by condensing the graph's strongly connected components; components within the same cycle share an execution priority.
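The guard is simple per-component bookkeeping. A stdlib-only sketch of the mechanism (the exception class and `run_with_limit` below are illustrative stand-ins, not Haystack's actual internals):

```python
from collections import Counter


class PipelineMaxComponentRuns(Exception):
    """Illustrative stand-in for Haystack's exception of the same name."""


def run_with_limit(schedule, max_runs_per_component=100):
    """Invoke (name, fn) pairs from `schedule`, counting runs per component.

    Raises PipelineMaxComponentRuns as soon as any single component is
    invoked more than `max_runs_per_component` times.
    """
    counts = Counter()
    results = []
    for name, fn in schedule:
        counts[name] += 1
        if counts[name] > max_runs_per_component:
            raise PipelineMaxComponentRuns(
                f"Component '{name}' exceeded {max_runs_per_component} runs"
            )
        results.append(fn())
    return results
```

With a cyclic schedule that never converges, the counter trips the limit instead of looping forever; a real Pipeline performs the same per-component bookkeeping during execution.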
## Usage
Be aware of this heuristic when building agent-based pipelines with feedback loops, debugging `PipelineMaxComponentRuns` exceptions, or designing iterative refinement workflows. If a pipeline legitimately needs more than 100 iterations (rare), increase `max_runs_per_component`. If you hit this limit unexpectedly, inspect your pipeline graph for unintended feedback loops.
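When the limit trips unexpectedly, the first debugging step is locating the cycle. A stdlib sketch of back-edge detection over an adjacency dict of component names (illustrative only, not a Haystack API):

```python
def find_cycle(graph):
    """Return one cycle in `graph` (dict: node -> list of successors), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}
    stack = []

    def dfs(node):
        color[node] = GRAY
        stack.append(node)
        for succ in graph.get(node, []):
            if color.get(succ, WHITE) == GRAY:  # back edge: cycle found
                return stack[stack.index(succ):] + [succ]
            if color.get(succ, WHITE) == WHITE:
                found = dfs(succ)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            found = dfs(node)
            if found:
                return found
    return None
```

Running this over your pipeline's connection graph shows exactly which components feed back into each other, so you can decide whether the loop is intentional.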
## The Insight (Rule of Thumb)
- Action: Set `max_runs_per_component` when constructing the Pipeline.
- Value: Default is 100. For agent loops, 10-20 is often sufficient. For convergence-based workflows, set based on expected worst-case iterations.
- Trade-off: Too low prevents legitimate iterative processing; too high delays detection of infinite loops.
- Cycle detection: Pipelines use NetworkX graph condensation to identify strongly connected components (cycles) and sort them topologically for deterministic execution.
## Reasoning
Without a safety limit, a misconfigured pipeline with a cycle would run forever, consuming compute resources. The default of 100 was chosen to be generous enough for most iterative workflows (agent loops typically converge in 3-10 steps) while catching genuine infinite loops before significant resource waste.
The cycle handling uses graph condensation: strongly connected components (components in cycles) are collapsed into single nodes, then topologically sorted. Within a cycle, components share the same priority and execute in lexicographic order.
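The quoted implementation relies on networkx; the same idea can be sketched with the standard library alone, using Tarjan's SCC algorithm in place of `networkx.condensation` (function names here are hypothetical):

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm over an adjacency dict; returns SCCs as sets,
    emitted in reverse topological order of the condensation."""
    index, low, on_stack, stack, sccs = {}, {}, set(), [], []
    counter = [0]

    def strongconnect(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of an SCC
            scc = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in list(graph):
        if v not in index:
            strongconnect(v)
    return sccs


def component_priorities(graph):
    """Assign one shared priority per SCC: every component in a cycle
    gets the same rank, and SCCs are ranked in topological order."""
    sccs = strongly_connected_components(graph)
    return {node: prio for prio, scc in enumerate(reversed(sccs)) for node in scc}
```

Ties within a shared priority can then be broken lexicographically, matching the in-cycle ordering described above, e.g. `sorted(nodes, key=lambda n: (priorities[n], n))`.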
Code evidence from `haystack/core/pipeline/base.py:88-103`:
```python
def __init__(
    self,
    metadata: dict[str, Any] | None = None,
    max_runs_per_component: int = 100,
    connection_type_validation: bool = True,
):
    """
    :param max_runs_per_component:
        How many times the `Pipeline` can run the same Component.
        If this limit is reached a `PipelineMaxComponentRuns` exception is raised.
        If not set defaults to 100 runs per Component.
    """
```
Cycle detection via graph condensation from `haystack/core/pipeline/base.py:1287-1302`:
```python
if topological_sort is None:
    if networkx.is_directed_acyclic_graph(self.graph):
        topological_sort = networkx.lexicographical_topological_sort(self.graph)
        topological_sort = {node: idx for idx, node in enumerate(topological_sort)}
    else:
        # If the graph is not a DAG, we use the condensation of the graph
        # to get a topological sort of the strongly connected components.
        condensed = networkx.condensation(self.graph)
        condensed_sorted = {
            node: idx for idx, node in enumerate(networkx.topological_sort(condensed))
        }
        topological_sort = {
            component_name: condensed_sorted[node]
            for component_name, node in condensed.graph["mapping"].items()
        }
```