Principle: Eventual Inc Daft AI Prompt Invocation
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Artificial_Intelligence |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Technique for invoking large language models on DataFrame columns for batch text generation.
Description
AI prompt invocation sends text prompts from a DataFrame column to an LLM provider (OpenAI, Anthropic, vLLM, etc.) and returns generated responses. This enables large-scale batch inference where each row in a DataFrame is processed through an LLM as part of a data pipeline.
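The batch-inference pattern described above can be sketched in plain Python. This is a minimal illustration, not Daft's actual API: `fake_llm` and `batch_prompt` are hypothetical stand-ins for a provider client and the DataFrame execution engine.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_llm(prompt: str) -> str:
    # Stand-in for a real provider call (e.g. an OpenAI chat completion).
    return f"summary of: {prompt}"

def batch_prompt(prompts: list[str], llm=fake_llm, max_concurrency: int = 4) -> list[str]:
    # Send each row's prompt to the LLM with bounded concurrency; output
    # order matches input order, since a result column must align with its rows.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(llm, prompts))

print(batch_prompt(["first document", "second document"]))
```

In the real engine this mapping runs per partition with retries and resource limits, but the core shape (prompt column in, response column out) is the same.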
Key capabilities include:
- Multi-provider support: Daft supports multiple LLM providers including OpenAI, Anthropic, vLLM (for local GPU inference), and any OpenAI-compatible API (via custom base URLs such as OpenRouter).
- Structured outputs: By passing a Pydantic `BaseModel` as the `return_format`, Daft instructs the LLM to return structured JSON that is automatically parsed into a Struct column matching the model schema.
- System messages: A `system_message` parameter allows setting the system prompt for all rows, providing consistent instructions to the LLM.
- Multimodal inputs: The `messages` parameter accepts multiple Expression columns, which can include text, images, and file data (PDF, audio, video), enabling multimodal prompting.
- Session-based provider management: Providers can be registered with a Daft Session, enabling reuse across multiple prompt calls and centralized API key management.
- Optimized vLLM path: For vLLM providers, Daft uses a specialized execution path with prefix caching and smart routing for maximum GPU utilization.
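The structured-outputs capability can be illustrated without Daft itself: a schema drives both the instruction given to the LLM and the parsing of its JSON reply into typed fields. The sketch below uses a stdlib dataclass in place of a Pydantic `BaseModel`; `parse_structured` is a hypothetical helper, not part of Daft's API.

```python
import json
from dataclasses import dataclass, fields

@dataclass
class Sentiment:
    # Stand-in for a Pydantic BaseModel passed as return_format.
    label: str
    score: float

def parse_structured(response_text: str, schema: type):
    # The LLM is instructed to emit JSON matching the schema; here we
    # pull out each declared field and coerce it to the annotated type,
    # roughly what parsing into a typed Struct column does per row.
    data = json.loads(response_text)
    kwargs = {f.name: f.type(data[f.name]) for f in fields(schema)}
    return schema(**kwargs)

row = parse_structured('{"label": "positive", "score": "0.93"}', Sentiment)
```

A malformed response would raise here; a production path additionally needs a policy for rows the model fails to format correctly (null out, retry, or error).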
Usage
Use this technique when you need to process text data through an LLM at scale. Common use cases include:
- Classifying text data (sentiment, topic, intent) using LLM-based classification
- Summarizing documents or text fields in a dataset
- Extracting structured information from unstructured text
- Generating descriptions, translations, or annotations for dataset rows
Theoretical Basis
AI prompt invocation follows a batch inference pattern that distributes LLM API calls across data partitions with configurable concurrency:
- UDF-based execution: The prompt function is implemented as a Daft class-based UDF, enabling it to run within the standard DataFrame execution framework with proper resource management (GPU allocation, concurrency limits, retries).
- Provider abstraction: A provider resolution chain checks (1) explicit provider argument, (2) session-registered providers, (3) environment-based defaults, enabling flexible configuration.
- Structured output parsing: When a Pydantic model is provided, the return type is inferred from the model schema and the LLM response is parsed into the corresponding Daft DataType (Struct with typed fields).
- Adaptive execution path: vLLM providers use a native Rust expression path with prefix caching for local GPU inference, while API-based providers use the UDF path with async HTTP calls.
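The provider resolution chain in the first two bullets can be sketched as an ordered fallback. This is an assumption-laden illustration: the environment variable name `DAFT_LLM_PROVIDER` and the shape of the session registry are hypothetical.

```python
import os

def resolve_provider(explicit=None, session_providers=None, default="openai"):
    # Resolution order described above:
    # explicit argument -> session registry -> environment -> default.
    if explicit is not None:
        return explicit
    if session_providers:
        # Assume the session yields its registered providers in
        # registration order; the real registry may be richer.
        return next(iter(session_providers))
    env = os.environ.get("DAFT_LLM_PROVIDER")  # hypothetical variable name
    if env:
        return env
    return default
```

The value of this chain is that library code never hard-codes a provider: callers can override per call, per session, or per deployment without touching the pipeline.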
Pseudocode:
1. Resolve provider (explicit -> session -> environment -> default "openai")
2. Load prompter descriptor from provider
3. Determine return dtype:
a. If return_format is Pydantic model: infer Struct dtype
b. Else: String dtype
4. If vLLM provider:
a. Use native Rust vLLM expression with prefix caching
5. Else (API-based provider):
a. Create class-based UDF with concurrency and retry config
b. For each row in partition:
- Construct messages (text, images, files)
- Send to LLM API
- Parse response (structured or plain text)
c. Return result column
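The API-based branch of the pseudocode (steps 3 and 5) can be made concrete with a small runnable sketch. Everything here is illustrative: `Topic` stands in for a Pydantic model, `stub_llm` for a real provider call, and the return dtype is represented as a plain string or dict rather than a Daft DataType.

```python
import json

class Topic:
    # Stand-in for a Pydantic BaseModel used as return_format.
    label: str

def infer_return_dtype(return_format):
    # Step 3: a model maps to a struct of typed fields; otherwise
    # responses stay as plain strings.
    if return_format is None:
        return "string"
    return {name: t.__name__ for name, t in return_format.__annotations__.items()}

def stub_llm(prompt: str) -> str:
    # Stand-in for step 5b's LLM API call, always returning valid JSON.
    return json.dumps({"label": "sports"})

def run_prompt(prompts, llm, return_format=None):
    # Steps 5b-5c: per-row call, parse structured or plain, return column.
    out = []
    for p in prompts:
        raw = llm(p)
        out.append(json.loads(raw) if return_format else raw)
    return out

column = run_prompt(["Is this about sports?"], stub_llm, Topic)
```

The vLLM branch (step 4) bypasses this Python loop entirely, which is why the pseudocode forks before any UDF is constructed.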