Implementation:Ucbepic Docetl MOARSearch Search

From Leeroopedia


Knowledge Sources
Domains: Optimization, Search_Algorithms
Last Updated: 2026-02-08 01:40 GMT

Overview

Concrete Monte Carlo Tree Search (MCTS) implementation for multi-objective pipeline optimization, provided by the MOAR module.

Description

MOARSearch implements Monte Carlo Tree Search over pipeline rewrite directives. It runs concurrent search agents that iterate through selection, expansion, simulation, and backpropagation phases. Each iteration applies a rewrite directive to generate a new YAML pipeline, executes it on a sample dataset, evaluates accuracy, and updates the Pareto frontier.
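The four phases named above can be illustrated with a minimal single-threaded sketch. It is not the MOARSearch code: `SketchNode`, `run_search`, the directive names, and the scalar `toy_reward` are all hypothetical stand-ins, and the real search is concurrent and multi-objective rather than driven by a single reward number.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class SketchNode:
    """Hypothetical stand-in for a pipeline variant in the search tree."""
    plan: Tuple[str, ...]                      # directives applied so far
    parent: Optional["SketchNode"] = None
    children: List["SketchNode"] = field(default_factory=list)
    visits: int = 0
    total_reward: float = 0.0


def ucb_score(node: SketchNode, c: float) -> float:
    """UCB1: mean reward plus an exploration bonus scaled by c."""
    if node.visits == 0:
        return math.inf
    mean = node.total_reward / node.visits
    return mean + c * math.sqrt(math.log(node.parent.visits) / node.visits)


def select(root: SketchNode, directives: Sequence[str], c: float) -> SketchNode:
    """Descend through fully expanded nodes, following the best UCB score."""
    node = root
    while len(node.children) == len(directives):
        node = max(node.children, key=lambda child: ucb_score(child, c))
    return node


def expand(node: SketchNode, directives: Sequence[str]) -> SketchNode:
    """Apply the first directive not yet tried at this node."""
    tried = {child.plan[-1] for child in node.children}
    name = next(d for d in directives if d not in tried)
    child = SketchNode(plan=node.plan + (name,), parent=node)
    node.children.append(child)
    return child


def backpropagate(node: Optional[SketchNode], reward: float) -> None:
    """Add the reward to the new node and every ancestor up to the root."""
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        node = node.parent


def run_search(
    directives: Sequence[str],
    reward_fn: Callable[[Tuple[str, ...]], float],
    iterations: int = 20,
    c: float = 1.414,
):
    root = SketchNode(plan=())
    evaluated = []
    for _ in range(iterations):
        leaf = select(root, directives, c)       # selection
        child = expand(leaf, directives)         # expansion
        reward = reward_fn(child.plan)           # simulation (toy scoring)
        backpropagate(child, reward)             # backpropagation
        evaluated.append((reward, child.plan))
    return root, evaluated


# Made-up directive names and reward, for illustration only.
directives = ["swap_model", "split_map"]
toy_reward = lambda plan: plan.count("swap_model") / len(plan)
root, evaluated = run_search(directives, toy_reward, iterations=20)
```

In this sketch every iteration evaluates exactly one new plan, so `max_iterations` corresponds directly to the number of simulated variants.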

The search tree is built from Node objects, each representing a pipeline variant. A ParetoFrontier tracks cost-accuracy tradeoffs using hypervolume indicator calculations.
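To make the cost-accuracy frontier and the hypervolume indicator concrete, here is a small two-dimensional sketch. It assumes cost is minimized and accuracy is maximized, and that the reference point is chosen by the caller. The function names and sample numbers are illustrative and are not taken from the ParetoFrontier class.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (cost, accuracy)


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing cost."""
    frontier: List[Point] = []
    best_accuracy = float("-inf")
    # Cheapest first; for equal cost, the most accurate first.
    for cost, accuracy in sorted(set(points), key=lambda p: (p[0], -p[1])):
        if accuracy > best_accuracy:
            frontier.append((cost, accuracy))
            best_accuracy = accuracy
    return frontier


def hypervolume_2d(frontier: List[Point], ref_cost: float, ref_accuracy: float) -> float:
    """Area dominated by the frontier, bounded by the reference point.

    Assumes ref_cost >= every cost and ref_accuracy <= every accuracy.
    """
    area = 0.0
    for i, (cost, accuracy) in enumerate(frontier):
        next_cost = frontier[i + 1][0] if i + 1 < len(frontier) else ref_cost
        area += (next_cost - cost) * (accuracy - ref_accuracy)
    return area


# Illustrative numbers only.
plans = [(1.0, 0.5), (2.0, 0.8), (3.0, 0.7), (4.0, 0.9)]
frontier = pareto_frontier(plans)
volume = hypervolume_2d(frontier, ref_cost=5.0, ref_accuracy=0.0)
```

Here `(3.0, 0.7)` is dropped because `(2.0, 0.8)` is both cheaper and more accurate. A larger hypervolume means the frontier covers more of the cost-accuracy space.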

Usage

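MOARSearch is instantiated and called by run_moar_optimization() during docetl build --optimizer moar. It requires a root YAML path, the set of available directives, sample data, dataset statistics and a dataset name, the list of candidate models, an evaluation function, and a search budget (max_iterations).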

Code Reference

Source Location

  • Repository: docetl
  • File: docetl/moar/MOARSearch.py
  • Lines: L39-481

Signature

class MOARSearch:
    def __init__(
        self,
        root_yaml_path: str,
        available_actions: set[Directive],
        sample_input,
        dataset_stats: str,
        dataset_name: str,
        available_models: List[str],
        evaluate_func: Callable,
        exploration_constant: float = 1.414,
        max_iterations: int = 20,
        model: str = "gpt-5",
        output_dir: Optional[str] = None,
        build_first_layer: Optional[bool] = True,
        custom_metric_key: Optional[str] = None,
        sample_dataset_path: Optional[str] = None,
    ):
        """Initialize MCTS search with pipeline and search parameters."""

    def search(self) -> List[Node]:
        """Perform MCTS search. Returns Pareto frontier nodes."""

    def search_iteration(self) -> bool:
        """Perform one complete MCTS iteration (select, expand, simulate, backprop)."""

Import

from docetl.moar.MOARSearch import MOARSearch

I/O Contract

Inputs

Name                  Type            Required  Description
root_yaml_path        str             Yes       Path to baseline pipeline YAML
available_actions     set[Directive]  Yes       Set of 25+ rewrite directives
sample_input          list[dict]      Yes       Sample dataset for simulation
evaluate_func         Callable        Yes       Scoring function from @register_eval
max_iterations        int             No        MCTS budget (default 20)
exploration_constant  float           No        UCB exploration weight (default 1.414)
available_models      List[str]       Yes       LLM models to explore
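For orientation, the role of exploration_constant can be read off the standard UCB1 score. The page does not state which UCB variant MOARSearch uses, so the formula below is an assumption based on textbook MCTS; the default 1.414 is approximately the square root of 2 used there.

```latex
\mathrm{UCB}(j) = \bar{X}_j + c \sqrt{\frac{\ln N}{n_j}}, \qquad c = 1.414 \approx \sqrt{2}
```

Here \bar{X}_j is the mean reward of child j, n_j its visit count, and N the visit count of its parent. As a worked instance with made-up values \bar{X}_j = 0.6, n_j = 5, N = 20: the bonus is 1.414 × sqrt(ln 20 / 5) ≈ 1.09, giving a score of about 1.69. Raising c favors rarely visited children; lowering it favors children with high mean reward.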

Outputs

Name              Type        Description
search() returns  List[Node]  Pareto frontier nodes (best cost-accuracy tradeoffs)
YAML files        files       Optimized pipeline configs written to output_dir
Pareto plots      PNG files   Cost vs accuracy scatter plots

Usage Examples

from docetl.moar.MOARSearch import MOARSearch
from docetl.reasoning_optimizer.directives import ALL_DIRECTIVES

# sample_data (a list of dicts) and wrapped_eval (the scoring callable) are
# assumed to be defined earlier in the calling code.
search = MOARSearch(
    root_yaml_path="pipeline.yaml",
    available_actions=ALL_DIRECTIVES,
    sample_input=sample_data,
    dataset_stats="100 documents, avg 500 tokens",
    dataset_name="legal_docs",
    available_models=["gpt-4o", "gpt-4o-mini"],
    evaluate_func=wrapped_eval,
    max_iterations=30,            # MCTS budget (default is 20)
    exploration_constant=1.414,   # UCB exploration weight (the default)
)

# search() returns the Pareto frontier nodes.
frontier_plans = search.search()
for plan in frontier_plans:
    print(f"Plan {plan.id}: cost=${plan.cost:.2f}, accuracy={plan.accuracy:.3f}")
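Because search() returns several tradeoff points rather than a single winner, the caller still has to pick one. One possible policy, sketched below, is to take the cheapest plan that clears an accuracy floor. `PlanStub` is a hypothetical stand-in exposing only the `id`, `cost`, and `accuracy` attributes used in the example above, and the threshold and numbers are made up.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PlanStub:
    """Hypothetical stand-in for a frontier node (id, cost, accuracy only)."""
    id: int
    cost: float
    accuracy: float


def cheapest_meeting_accuracy(
    plans: Iterable[PlanStub], min_accuracy: float
) -> Optional[PlanStub]:
    """Return the lowest-cost plan with accuracy >= min_accuracy, or None."""
    eligible = [p for p in plans if p.accuracy >= min_accuracy]
    return min(eligible, key=lambda p: p.cost) if eligible else None


# Illustrative frontier only.
stub_frontier = [
    PlanStub(id=1, cost=0.40, accuracy=0.71),
    PlanStub(id=2, cost=1.10, accuracy=0.84),
    PlanStub(id=3, cost=3.25, accuracy=0.90),
]
choice = cheapest_meeting_accuracy(stub_frontier, min_accuracy=0.80)
```

Other policies, such as the most accurate plan under a cost cap, follow the same filter-then-rank shape.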
