Implementation:Ucbepic Docetl MOAR CliHelpers

From Leeroopedia


Knowledge Sources
Domains Data_Processing, Optimization, CLI
Last Updated 2026-02-08 00:00 GMT

Overview

Command-line helper functions, provided by DocETL, that configure and launch MOAR optimizer runs.

Description

The cli_helpers module provides functions to infer dataset information from YAML pipeline configs, load custom evaluation functions, and orchestrate a complete MOAR optimization run from the command line. It handles parameter extraction from optimizer_config sections, resolves relative file paths, validates required configuration fields, and initializes MOARSearch with the correct parameters before running the search.
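The validation and path-resolution steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's actual code: the helper names `missing_fields` and `resolve_against_yaml` are hypothetical, treating every field listed for optimizer_config as required is an assumption, and so is resolving relative paths against the directory that contains the YAML file.

```python
from pathlib import Path
from typing import Any, Dict, Iterable, List

# Field names come from the optimizer_config description on this page.
# Treating all of them as required is an assumption of this sketch.
ASSUMED_REQUIRED_FIELDS = (
    "save_dir",
    "available_models",
    "evaluation_file",
    "metric_key",
    "max_iterations",
)


def missing_fields(
    optimizer_config: Dict[str, Any],
    required: Iterable[str] = ASSUMED_REQUIRED_FIELDS,
) -> List[str]:
    """Return the required keys that are absent or set to None."""
    return [key for key in required if optimizer_config.get(key) is None]


def resolve_against_yaml(yaml_path: str, candidate: str) -> Path:
    """Resolve a possibly relative path against the YAML file's directory.

    Absolute paths are returned unchanged. Using the YAML file's directory
    as the base (rather than the working directory) is an assumption.
    """
    path = Path(candidate)
    if path.is_absolute():
        return path
    return (Path(yaml_path).resolve().parent / path).resolve()


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example_config = {"save_dir": "results", "evaluation_file": "evaluate.py"}
    print("Missing:", missing_fields(example_config))
    print("Resolved:", resolve_against_yaml("my_pipeline.yaml", "evaluate.py"))
```

Reporting every missing key at once, rather than failing on the first, lets a user fix the YAML in a single pass.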

Usage

Use these helpers when running the MOAR optimizer from the CLI via docetl optimize or when programmatically launching MOAR optimization from a YAML pipeline configuration that includes an optimizer_config section.

Code Reference

Source Location

Signature

from typing import Any, Dict

def infer_dataset_info(yaml_path: str, config: dict) -> tuple[str, str]: ...

def load_evaluation_function(config: dict, dataset_file_path: str) -> callable: ...

def run_moar_optimization(
    yaml_path: str,
    optimizer_config: dict,
) -> Dict[str, Any]: ...

Import

from docetl.moar.cli_helpers import infer_dataset_info, load_evaluation_function, run_moar_optimization

I/O Contract

Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| yaml_path | str | Yes | Path to the YAML pipeline configuration file |
| config | dict | Yes | Full YAML config dictionary (for infer_dataset_info) |
| optimizer_config | dict | Yes | Dictionary from the YAML optimizer_config section containing save_dir, available_models, evaluation_file, metric_key, max_iterations |
| dataset_file_path | str | Yes | Path to the dataset file (for load_evaluation_function) |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| dataset_path | str | Resolved absolute path to the dataset file |
| dataset_name | str | Name of the dataset from the YAML config |
| evaluate_func | callable | Loaded evaluation function decorated with @docetl.register_eval |
| experiment_summary | Dict[str, Any] | Summary dictionary with optimization results, costs, and timings |

Usage Examples

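import yaml

from docetl.moar.cli_helpers import run_moar_optimization

# Load a pipeline YAML that contains an optimizer_config section
with open("my_pipeline.yaml", "r") as f:
    config = yaml.safe_load(f)

# Fail early if the section is missing instead of passing an empty dict
optimizer_config = config.get("optimizer_config")
if not optimizer_config:
    raise ValueError("my_pipeline.yaml has no optimizer_config section")

# Run MOAR optimization
results = run_moar_optimization(
    yaml_path="my_pipeline.yaml",
    optimizer_config=optimizer_config,
)

# The summary's exact keys are not listed on this page; "best_cost" is
# assumed here, so .get() is used to avoid a KeyError if it is absent.
print(f"Best pipeline cost: {results.get('best_cost')}")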
