Principle:Ucbepic Docetl Pipeline Optimization Search
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Search_Algorithms |
| Last Updated | 2026-02-08 01:40 GMT |
Overview
A search principle that uses Monte Carlo Tree Search (MCTS) to explore candidate pipeline rewrites, tracking the tradeoff between accuracy and cost on a Pareto frontier.
Description
Pipeline Optimization Search applies MCTS to the problem of automatically rewriting LLM pipeline operations for better accuracy and lower cost. Each iteration of the MOAR (Multi-Objective Agent-based Rewriting) algorithm runs four phases:
- Selection: Choose a promising node in the search tree using UCB (Upper Confidence Bound)
- Expansion: Apply a rewrite directive (from 25+ available) to generate a new pipeline variant
- Simulation: Execute the variant pipeline on a sample dataset and evaluate accuracy
- Backpropagation: Update node values throughout the tree based on the result
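The four phases above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not DocETL's implementation: the `Node` class, the `search` function, the representation of a pipeline as a tuple of directive names, and the `evaluate` callback are all hypothetical, and the selection score is the standard UCB1 form.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    # Hypothetical representation: a pipeline is the tuple of directives applied so far.
    pipeline: tuple
    parent: Optional["Node"] = None
    children: list = field(default_factory=list)
    visits: int = 0
    total_reward: float = 0.0


def ucb(node, c=1.4):
    """Standard UCB1 score: average reward plus an exploration bonus."""
    if node.visits == 0:
        return math.inf
    average = node.total_reward / node.visits
    bonus = c * math.sqrt(math.log(node.parent.visits) / node.visits)
    return average + bonus


def search(root, directives, evaluate, iterations=50, c=1.4, seed=0):
    rng = random.Random(seed)
    for _ in range(iterations):
        # Selection: descend through fully expanded nodes by highest UCB score.
        node = root
        while node.children and len(node.children) == len(directives):
            node = max(node.children, key=lambda child: ucb(child, c))

        # Expansion: apply one directive not yet tried at this node.
        tried = {child.pipeline[-1] for child in node.children}
        directive = rng.choice([d for d in directives if d not in tried])
        child = Node(pipeline=node.pipeline + (directive,), parent=node)
        node.children.append(child)

        # Simulation: score the new variant (stand-in for running it on a sample).
        reward = evaluate(child.pipeline)

        # Backpropagation: update visit counts and rewards up to the root.
        current = child
        while current is not None:
            current.visits += 1
            current.total_reward += reward
            current = current.parent
    return root
```

Calling `search(Node(pipeline=()), ["chunking", "gleaning"], evaluate)` with any function that maps a pipeline tuple to a numeric reward returns the root of the explored tree. In a real optimizer the `evaluate` step would execute the rewritten pipeline, which is by far the most expensive part of each iteration.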
The search maintains a Pareto frontier tracking the tradeoff between accuracy and cost. Rewrite directives include operation chaining, gleaning, chunking, model swapping, fusion, compression, and more.
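One way to maintain such a frontier is sketched below, assuming each evaluated variant is summarized as an `(accuracy, cost)` pair where higher accuracy and lower cost are better. The function names `dominates` and `update_frontier` are illustrative and not taken from DocETL.

```python
def dominates(a, b):
    """True if point a is at least as good as b on both objectives and strictly better on one.

    Points are (accuracy, cost) pairs: accuracy is maximized, cost is minimized.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def update_frontier(frontier, candidate):
    """Return a new frontier that accounts for the candidate point."""
    # A dominated or duplicate candidate leaves the frontier unchanged.
    if any(dominates(p, candidate) or p == candidate for p in frontier):
        return list(frontier)
    # Otherwise drop every point the candidate dominates, then add it.
    return [p for p in frontier if not dominates(candidate, p)] + [candidate]
```

For example, adding `(0.7, 1.5)` to a frontier containing `(0.8, 1.0)` changes nothing, because the existing point is both more accurate and cheaper.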
Usage
This principle is applied when running the MOAR optimizer (docetl build --optimizer moar). It requires an evaluation function to score pipeline quality and a set of available LLM models to search over.
Theoretical Basis
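MCTS for pipeline optimization selects nodes by their UCB score:

$$\mathrm{UCB}(i) = \bar{X}_i + c \sqrt{\frac{\ln N}{n_i}}$$

where $\bar{X}_i$ is the average reward (accuracy change), $N$ is the parent visit count, $n_i$ is the node visit count, and $c$ is the exploration constant.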
The Pareto frontier is maintained using hypervolume indicator calculations to track cost-accuracy tradeoffs across all explored pipeline variants.
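For two objectives, the hypervolume indicator reduces to the area dominated by the frontier relative to a reference point. The sketch below assumes `(accuracy, cost)` pairs with accuracy maximized and cost minimized. The function name `hypervolume_2d` and the default reference point (accuracy 0, cost 10) are hypothetical choices for illustration, not values taken from DocETL.

```python
def hypervolume_2d(points, ref_accuracy=0.0, ref_cost=10.0):
    """Area dominated by the points, bounded by the (ref_accuracy, ref_cost) reference point.

    Points are (accuracy, cost) pairs: accuracy is maximized, cost is minimized.
    """
    # Sweep from cheapest to most expensive; ties go to the more accurate point.
    ordered = sorted(points, key=lambda p: (p[1], -p[0]))
    area = 0.0
    best_accuracy = ref_accuracy
    for accuracy, cost in ordered:
        if accuracy <= best_accuracy:
            continue  # dominated by a cheaper point already counted
        # New accuracy band, covered at this cost up to the reference cost.
        height = max(0.0, ref_cost - cost)
        area += (accuracy - best_accuracy) * height
        best_accuracy = accuracy
    return area
```

A larger value means the explored variants cover more of the accuracy-cost space, so the change in hypervolume after adding a variant is one way to quantify how much that variant improved the frontier.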