Principle:Ucbepic Docetl Programmatic Optimization
| Knowledge Sources | |
|---|---|
| Domains | Optimization, API_Design |
| Last Updated | 2026-02-08 01:40 GMT |
Overview
Programmatic optimization rewrites a pipeline automatically through the Python API and returns the result as an optimized Pipeline object.
Description
Programmatic Optimization exposes DocETL's pipeline optimizer through the Python API's Pipeline.optimize() method. This enables embedding optimization in automated workflows, notebooks, and application code without requiring CLI invocation. The method internally delegates to the V1 Optimizer and returns a new Pipeline instance with optimized operation configurations.
Usage
Use Pipeline.optimize() when optimization needs to run from Python code (a notebook, an automated workflow, or an application) rather than from the CLI. Only operations marked with optimize=True are eligible for rewriting.
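Before calling the optimizer, it can help to check which operations actually carry the flag. The following is a minimal sketch in plain Python, not part of DocETL: the helper `eligible_operations` and the sample operation names are hypothetical, and it assumes operation configurations are available as dictionaries in which a missing `optimize` key means "not eligible".

```python
from typing import Any


def eligible_operations(operations: list[dict[str, Any]]) -> list[str]:
    """Return the names of operations explicitly flagged with optimize=True.

    Hypothetical helper for illustration; assumes a missing or false
    `optimize` key means the operation is not eligible for rewriting.
    """
    return [op["name"] for op in operations if op.get("optimize") is True]


# Sample operation configs (names and types are illustrative only).
operations = [
    {"name": "extract_topics", "type": "map", "optimize": True},
    {"name": "summarize", "type": "reduce"},                        # no flag
    {"name": "dedupe", "type": "resolve", "optimize": False},       # opted out
]

eligible = eligible_operations(operations)
print(eligible)
```

If the list comes back empty, no operation is eligible for rewriting, so the flags are worth checking before spending time on an optimization run.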
Theoretical Basis
API-driven optimization:
- Pipeline Construction: Build the pipeline with operations flagged optimize=True
- Delegation: Pipeline.optimize() creates a DSLRunner and invokes the Optimizer
- Rewriting: The Optimizer rewrites eligible operations to improve accuracy or reduce cost
- Return: A new Pipeline object is returned with the optimized configurations
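The four steps above can be modeled with a small, self-contained toy. This is a sketch of the delegation pattern only, not DocETL's implementation: `Operation`, `ToyRunner`, and `ToyOptimizer` are hypothetical stand-ins, the `Pipeline` class here is a simplified substitute for the real one, and the "rewrite" simply tags the prompt so the effect is visible.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Operation:
    name: str
    prompt: str
    optimize: bool = False  # assumption: operations are opt-in


@dataclass(frozen=True)
class Pipeline:
    """Toy stand-in for a pipeline; frozen so optimize() cannot mutate it."""

    name: str
    operations: tuple[Operation, ...]

    def optimize(self) -> Pipeline:
        # Delegation: hand the pipeline to a runner, which drives the optimizer.
        return ToyRunner(self).optimize()


class ToyOptimizer:
    """Hypothetical optimizer; a real one would search over rewrites."""

    def rewrite(self, op: Operation) -> Operation:
        return replace(op, prompt=op.prompt + " [rewritten]")


class ToyRunner:
    """Hypothetical runner that applies the optimizer to eligible operations."""

    def __init__(self, pipeline: Pipeline) -> None:
        self.pipeline = pipeline

    def optimize(self) -> Pipeline:
        optimizer = ToyOptimizer()
        new_ops = tuple(
            optimizer.rewrite(op) if op.optimize else op
            for op in self.pipeline.operations
        )
        # Return: a new Pipeline; the original is left untouched.
        return Pipeline(name=self.pipeline.name, operations=new_ops)


# Pipeline Construction: one flagged operation, one unflagged.
original = Pipeline(
    name="demo",
    operations=(
        Operation("extract_topics", "Extract topics from {{ input.text }}", optimize=True),
        Operation("summarize", "Summarize {{ input.text }}"),
    ),
)
optimized = original.optimize()
```

Returning a new object rather than mutating in place means the original pipeline stays available for comparison, which is useful when checking whether the rewritten operations are worth adopting.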