Implementation: Eventual Inc Daft AI Prompt
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Artificial_Intelligence |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
A concrete tool from the Daft library for prompting LLMs over DataFrame columns.
Description
The prompt function returns an expression that sends text (and optionally images or files) to a large language model and returns the generated response. It supports multiple providers (OpenAI, Anthropic, vLLM, and any OpenAI-compatible API), structured outputs via Pydantic models, system messages, and multimodal inputs. For vLLM providers, a specialized native Rust execution path with prefix caching is used for maximum GPU efficiency.
Usage
Import and use this function when you need to process DataFrame rows through an LLM for classification, summarization, extraction, or generation tasks.
Code Reference
Source Location
- Repository: Daft
- File: daft/functions/ai/__init__.py, lines 453-652
Signature
def prompt(
    messages: list[Expression] | Expression,
    return_format: BaseModel | None = None,
    *,
    system_message: str | None = None,
    provider: str | Provider | None = None,
    model: str | None = None,
    **options: Any,
) -> Expression
Import
from daft.functions.ai import prompt
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| messages | list[Expression] \| Expression | Yes | The prompt text expression(s). Each expression can be plain text, image data, or file data (PDF, audio, video). |
| return_format | BaseModel \| None | No | A Pydantic model for structured output. When provided, the LLM response is parsed into the model's schema. Defaults to None (plain text response). |
| system_message | str \| None | No | A system message providing instructions to the LLM. Applied to all rows. |
| provider | str \| Provider \| None | No | The LLM provider to use (e.g., "openai", "anthropic", "vllm"). Defaults to "openai". |
| model | str \| None | No | The specific model to use (e.g., "gpt-5-nano"). If None, the provider's default model is used. |
| **options | Any | No | Additional provider-specific options (e.g., temperature, max_tokens). |
Outputs
| Name | Type | Description |
|---|---|---|
| return | Expression (String or Struct) | A String expression with the LLM response text, or a Struct expression matching the Pydantic model schema when return_format is provided. |
Usage Examples
Basic Usage
import daft
from daft.functions.ai import prompt
df = daft.from_pydict({"text": ["What is the capital of France?", "What is 2 + 2?"]})
df = df.with_column(
    "response",
    prompt(
        daft.col("text"),
        provider="openai",
        model="gpt-5-nano",
    ),
)
df.show()
Structured Output with Pydantic
import daft
from daft.functions.ai import prompt
from pydantic import BaseModel, Field
class Sentiment(BaseModel):
    label: str = Field(description="Sentiment label: positive, negative, or neutral")
    confidence: float = Field(description="Confidence score between 0 and 1")
df = daft.from_pydict({"review": ["Great product!", "Terrible experience."]})
df = df.with_column(
    "sentiment",
    prompt(
        daft.col("review"),
        return_format=Sentiment,
        system_message="Classify the sentiment of the review.",
        provider="openai",
        model="gpt-5-nano",
    ),
)
df.show()
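Conceptually, a structured output works by asking the provider for JSON conforming to the Pydantic model's schema, which is then surfaced as a Struct column. A minimal stdlib-only sketch of that parse-and-check step, where the raw JSON string is a stand-in for a real model response rather than actual output:

```python
# Illustrative sketch: parsing a JSON model response into the fields that a
# Pydantic return_format like Sentiment(label, confidence) declares.
# The raw string below is a stand-in, not real model output.
import json

raw_response = '{"label": "positive", "confidence": 0.97}'

parsed = json.loads(raw_response)

# Minimal validation mirroring the Sentiment schema from the example above.
assert parsed["label"] in {"positive", "negative", "neutral"}
assert 0.0 <= parsed["confidence"] <= 1.0

print(parsed["label"], parsed["confidence"])
```

In the DataFrame, the resulting Struct column exposes each schema field, so downstream expressions can filter or sort on, for example, the confidence value directly.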