Implementation:Ucbepic Docetl Directive AgentUtils
| Knowledge Sources | |
|---|---|
| Domains | Pipeline_Optimization, LLM_Operations |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete utility module for agentic directive execution provided by the DocETL reasoning optimizer.
Description
The agent_utils.py module provides shared infrastructure for directives that use agentic document-reading loops. It contains the AgenticDirectiveRunner class, which manages context windows, iterative document reading via ReadNextDocTool, and LLM-driven decision-making loops. It also includes helper classes like AgentDecision (a Pydantic schema for action selection) and utilities such as estimate_token_count and truncate_message_content.
Usage
Used internally by directives that require iterative document sampling before instantiation, such as ArbitraryRewriteDirective, CascadeFilteringDirective, ClarifyInstructionsDirective, and SwapWithCodeDirective. The MOAR agent does not invoke this module directly.
Code Reference
Source Location
- Repository: Ucbepic_Docetl
- File: docetl/reasoning_optimizer/directives/agent_utils.py
- Lines: 1-458
Signature
class AgenticDirectiveRunner:
"""Utility class for running agentic directives that iteratively process documents."""
def __init__(self, input_data: List[Dict], agent_llm: str = "gpt-4.1-mini",
validation_func: Optional[callable] = None, enable_operator_docs: bool = False): ...
def _get_model_context_window(self, model: str) -> int: ...
def _read_operator_doc(self, operator_name: str) -> Optional[str]: ...
class ReadNextDocTool:
"""Tool for iteratively reading documents from input data."""
def read_next_docs(self, count: int = None) -> List[Dict]: ...
def has_more_docs(self) -> bool: ...
class AgentDecision(BaseModel):
action: Literal["read_next_docs", "read_operator_doc", "output_schema"]
reasoning: str
operator_name: Optional[str]
Import
from docetl.reasoning_optimizer.directives.agent_utils import AgenticDirectiveRunner, ReadNextDocTool, AgentDecision
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| input_data | List[Dict] | Yes | Sample documents for iterative reading |
| agent_llm | str | No | LLM model for agent decisions (default: gpt-4.1-mini) |
| validation_func | callable | No | Optional validation function for agent outputs |
| enable_operator_docs | bool | No | Whether to allow reading operator documentation |
Outputs
| Name | Type | Description |
|---|---|---|
| schema_output | BaseModel | Instantiation schema produced by the agentic loop |
| message_history | List[Dict] | Conversation history from the agentic session |
Usage Examples
# AgenticDirectiveRunner is used internally by agentic directives
from docetl.reasoning_optimizer.directives.agent_utils import AgenticDirectiveRunner
runner = AgenticDirectiveRunner(
input_data=sample_documents,
agent_llm="gpt-4.1-mini",
enable_operator_docs=True
)
# The runner manages iterative document reading and LLM decision loops