Implementation:Ucbepic Docetl Build CLI Command

From Leeroopedia
Revision as of 16:59, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Ucbepic_Docetl_Build_CLI_Command.md)


Knowledge Sources
Domains: Optimization, CLI
Last Updated: 2026-02-08 01:40 GMT

Overview

Concrete CLI command and helper for configuring and launching DocETL pipeline optimization.

Description

The docetl build CLI command parses the optimizer_config section from a YAML pipeline file, validates required fields (evaluation_file, metric_key, available_models, max_iterations, save_dir), infers dataset information, and dispatches to either the V1 or MOAR optimizer. The run_moar_optimization() helper function handles MOAR-specific setup.
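The required-field check described above can be sketched as follows. This is a minimal illustration, not the code in docetl/cli.py: the helper name find_missing_fields is hypothetical, and the config is passed as a plain dict rather than loaded from YAML so the example stays self-contained.

```python
from typing import Any, Dict, List

# Required keys, as listed in the description of optimizer_config.
REQUIRED_FIELDS = (
    "evaluation_file",
    "metric_key",
    "available_models",
    "max_iterations",
    "save_dir",
)


def find_missing_fields(optimizer_config: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return required keys that are absent or None."""
    return [key for key in REQUIRED_FIELDS if optimizer_config.get(key) is None]


# Example config that forgets max_iterations.
config = {
    "save_dir": "./moar_results",
    "available_models": ["gpt-4o", "gpt-4o-mini"],
    "evaluation_file": "evaluate.py",
    "metric_key": "accuracy",
}

missing = find_missing_fields(config)
if missing:
    print("optimizer_config is missing: " + ", ".join(missing))
```

Reporting every missing key at once, rather than failing on the first, lets the user fix the YAML in a single pass.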

Usage

Use docetl build pipeline.yaml --optimizer moar to launch MOAR optimization. The pipeline YAML must include an optimizer_config section with all required fields.
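When the command is launched from a script rather than a shell, the argument list can be assembled programmatically. The sketch below is an assumption-laden illustration: build_command is a hypothetical helper, and it uses only the --optimizer and --resume flags that appear on this page.

```python
import shlex
from pathlib import Path
from typing import List


def build_command(
    yaml_file: Path, optimizer: str = "moar", resume: bool = False
) -> List[str]:
    """Hypothetical helper: assemble argv for `docetl build`."""
    # The page documents two optimizer values: "moar" (default) and "v1".
    if optimizer not in ("moar", "v1"):
        raise ValueError(f"unknown optimizer: {optimizer!r}")
    argv = ["docetl", "build", str(yaml_file), "--optimizer", optimizer]
    if resume:
        argv.append("--resume")
    return argv


argv = build_command(Path("pipeline.yaml"), resume=True)
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run(argv, check=True) on a machine where docetl is installed; building a list instead of a string avoids shell-quoting problems with unusual file names.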

Code Reference

Source Location

  • Repository: docetl
  • File: docetl/cli.py (L19-198), docetl/moar/cli_helpers.py (L92-305)

Signature

# CLI command
def build(
    yaml_file: Path,
    optimizer: str = "moar",
    max_threads: int | None = None,
    resume: bool = False,
    save_path: Path = None,
) -> None:
    """Build/optimize a DocETL pipeline."""

# MOAR helper
def run_moar_optimization(
    yaml_path: str,
    optimizer_config: dict,
) -> Dict[str, Any]:
    """Run MOAR optimization from CLI. Returns experiment summary."""

Import

# CLI usage
docetl build pipeline.yaml --optimizer moar

I/O Contract

Inputs

Name | Type | Required | Description
yaml_file | Path | Yes | Path to YAML pipeline with optimizer_config
optimizer | str | No | "moar" (default) or "v1"
optimizer_config.evaluation_file | str | Yes | Path to @register_eval Python file
optimizer_config.metric_key | str | Yes | Key in evaluation results dict to optimize
optimizer_config.available_models | list[str] | Yes | LLM models to search over
optimizer_config.max_iterations | int | Yes | MCTS iteration budget
optimizer_config.save_dir | str | Yes | Output directory for optimized pipelines

Outputs

Name | Type | Description
results | Dict[str, Any] | Experiment summary with paths to optimized pipelines
optimized YAMLs | files | Written to save_dir

Usage Examples

# Run MOAR optimization
docetl build pipeline.yaml --optimizer moar

# Resume interrupted optimization
docetl build pipeline.yaml --optimizer moar --resume

# optimizer_config section in pipeline.yaml
optimizer_config:
  type: moar
  save_dir: ./moar_results
  available_models:
    - gpt-4o
    - gpt-4o-mini
  evaluation_file: evaluate.py
  metric_key: accuracy
  max_iterations: 40
  rewrite_agent_model: gpt-4o

Related Pages

Implements Principle
