Environment: Eventual Inc Daft AI Provider Dependencies
| Knowledge Sources | |
|---|---|
| Domains | AI/ML, LLM, Embeddings, GPU Computing, Transformers |
| Last Updated | 2026-02-08 15:30 GMT |
Overview
The AI_Provider_Dependencies environment defines the optional Python packages, API credentials, and GPU configuration required for Daft's AI integrations, including LLM prompting (OpenAI, Google Gemini, vLLM), text/image embedding (Transformers, Sentence Transformers), and GPU-accelerated UDF execution.
Description
Daft provides first-class AI integrations that allow users to apply large language models and embedding models directly within dataframe operations. These integrations are organized into several provider-specific extras, each bringing its own set of dependencies:
- OpenAI -- Cloud-based LLM and embedding API via the OpenAI Python client. Requires an API key and supports structured outputs via Pydantic V2 schemas.
- Google Gemini -- Cloud-based LLM via the Google GenAI client library.
- Transformers/HuggingFace -- Local model inference using PyTorch-backed Transformers and Sentence Transformers. Supports CPU and GPU execution with automatic device detection.
- vLLM -- High-throughput local LLM inference with prefix caching. Available as a development dependency.
Daft includes automatic GPU detection that probes for CUDA (NVIDIA GPUs) and MPS (Apple Silicon) and falls back to CPU when neither is available. When GPUs are present, the AI functions automatically configure UDF concurrency and resource allocation, with behavior that depends on whether the native or Ray runner is in use.
Pydantic V2 is required for structured output support, where LLM responses are parsed into typed data structures defined by Pydantic models.
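As an illustration of the structured-output flow, here is a minimal Pydantic V2 model of the kind an LLM response could be parsed into. The `Sentiment` model and the JSON payload are hypothetical examples, not part of Daft's API:

```python
from pydantic import BaseModel


class Sentiment(BaseModel):
    """Hypothetical schema that a structured LLM response is validated against."""

    label: str
    confidence: float


# An LLM configured for structured output returns JSON matching the schema,
# which Pydantic V2 validates into a typed object.
raw = '{"label": "positive", "confidence": 0.97}'
result = Sentiment.model_validate_json(raw)
print(result.label, result.confidence)
```

Because the schema is an ordinary Pydantic model, downstream dataframe operations can rely on typed fields rather than free-form text.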
Usage
Use this environment when:
- Applying LLM prompting to dataframe columns (e.g., text classification, summarization, generation)
- Computing text or image embeddings at scale
- Running local transformer models on GPU or CPU
- Using structured outputs with Pydantic V2 schemas
- Deploying GPU-accelerated UDFs across native or Ray runners
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| Python | >= 3.10 | Inherited from the core environment |
| CUDA | >= 11.8 (for GPU support) | Required for NVIDIA GPU acceleration with PyTorch/vLLM |
| MPS | macOS with Apple Silicon | Alternative GPU backend for Apple Silicon (M1/M2/M3/M4) |
| RAM | 8 GB+ recommended | Local model inference can require significant memory; varies by model size |
| GPU VRAM | Varies by model | Transformer models may require 2-48+ GB VRAM depending on model size |
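As a rough rule of thumb (our assumption, not a figure from Daft's documentation), the VRAM needed just for model weights is parameter count times bytes per parameter; activations and KV caches add more on top. A quick sketch:

```python
def estimate_weight_vram_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Rough VRAM (GiB) for model weights alone.

    Default of 2 bytes/param corresponds to fp16/bf16; use 4 for fp32,
    1 for int8 quantization. Runtime overhead is not included.
    """
    return num_params * bytes_per_param / 1024**3


# A 7B-parameter model in fp16 needs roughly 13 GiB for weights alone.
print(round(estimate_weight_vram_gb(7e9), 1))
```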
Dependencies
System Packages
- All core environment system packages
- CUDA toolkit (optional) -- for NVIDIA GPU support with PyTorch
- cuDNN (optional) -- for optimized GPU neural network operations
Python Packages
OpenAI Extra (daft[openai])
- openai < 2.9.0 -- OpenAI Python client library
- numpy < 2.4.0 -- Numerical computing (required for embedding operations)
- pillow == 11.0.0 -- Image processing (required for multimodal inputs)
Google Gemini Extra (daft[google])
- google-genai < 1.53.0 -- Google GenAI client library
- numpy < 2.4.0 -- Numerical computing
- pillow == 11.0.0 -- Image processing
Transformers Extra (daft[transformers])
- transformers < 4.58.0 -- HuggingFace Transformers library
- sentence-transformers < 5.2.0 -- Sentence embedding models
- torch < 2.10.0 -- PyTorch deep learning framework
- torchvision < 0.25.0 -- PyTorch vision utilities
- pillow == 11.0.0 -- Image processing
vLLM (Dev Dependency)
- vllm == 0.11.0 -- High-throughput LLM inference engine with prefix caching
Structured Outputs
- pydantic >= 2.0.0, < 3.0.0 -- Required for structured output support (DataType inference from Pydantic models)
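The Pydantic major-version requirement amounts to a simple gate on the installed version string. A dependency-free sketch (the helper name `pydantic_version_supported` is ours, not Daft's):

```python
def pydantic_version_supported(version: str) -> bool:
    """True only for Pydantic V2 releases (>= 2.0.0, < 3.0.0)."""
    major = int(version.split(".")[0])
    return major == 2


print(pydantic_version_supported("2.11.3"))   # True: a V2 release
print(pydantic_version_supported("1.10.13"))  # False: V1 is rejected
```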
Credentials
| Variable | Provider | Description | Required |
|---|---|---|---|
| OPENAI_API_KEY | OpenAI | API key for OpenAI endpoints (GPT, embeddings) | Yes (when using OpenAI provider) |
| OPENROUTER_API_KEY | OpenRouter | API key for OpenRouter multi-provider LLM gateway | Yes (when using OpenRouter provider) |
| GOOGLE_API_KEY | Google Gemini | API key for Google GenAI (Gemini models) | Yes (when using Google provider) |
Quick Install
```bash
# OpenAI support (cloud-based LLM and embeddings)
pip install "daft[openai]"

# Google Gemini support (cloud-based LLM)
pip install "daft[google]"

# HuggingFace Transformers support (local model inference)
pip install "daft[transformers]"

# All AI providers
pip install "daft[openai,google,transformers]"
```
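After installing, you can verify which provider dependencies are importable without actually importing them. A small stdlib sketch (the `module_available` helper and the provider-to-module mapping are our assumptions):

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:  # raised when a parent package is missing
        return False


# Map each Daft extra to the top-level module it installs.
for provider, module in [
    ("openai", "openai"),
    ("google", "google.genai"),
    ("transformers", "transformers"),
]:
    status = "ok" if module_available(module) else "missing"
    print(f"{provider}: {status}")
```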
GPU Detection and Allocation
Daft includes automatic GPU detection and resource allocation logic that adapts to the active runner.
Device Detection
From daft/ai/utils.py, the get_torch_device() function probes available hardware in order of preference:
- CUDA -- NVIDIA GPU with CUDA support (highest priority)
- MPS -- Apple Silicon Metal Performance Shaders
- CPU -- Fallback when no GPU is available
GPU UDF Allocation by Runner
| Runner | GPU Detection Method | Behavior |
|---|---|---|
| Native | Reads `CUDA_VISIBLE_DEVICES` environment variable | Uses all GPUs visible to the current process. Sets UDF concurrency to the number of GPUs; each UDF instance gets 1 GPU. |
| Ray | Queries `ray.nodes()` for cluster GPU resources | Counts total GPUs across all Ray cluster nodes. Sets UDF concurrency to the total GPU count; each actor requests 1 GPU. |
| No GPUs | N/A | Falls back to CPU execution with default thread-based concurrency. |
Code Evidence
OpenAI dependencies from pyproject.toml line 49:
```toml
openai = ["openai<2.9.0", "numpy<2.4.0", "pillow==11.0.0"]
```
Google Gemini dependencies from pyproject.toml line 39:
```toml
google = ["google-genai<1.53.0", "numpy<2.4.0", "pillow==11.0.0"]
```
Transformers dependencies from pyproject.toml line 57:
```toml
transformers = ["transformers<4.58.0", "sentence-transformers<5.2.0", "torch<2.10.0", "torchvision<0.25.0", "pillow==11.0.0"]
```
GPU device detection from daft/ai/utils.py lines 19-32:
```python
def get_torch_device() -> torch.device:
    """Get the best available PyTorch device for computation."""
    import torch

    # 1. CUDA GPU (if available) - for NVIDIA GPUs with CUDA support
    if torch.cuda.is_available():
        return torch.device("cuda")
    # 2. MPS (Metal Performance Shaders) - for Apple Silicon Macs
    if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
        return torch.device("mps")
    # 3. CPU - as fallback when no GPU acceleration is available
    return torch.device("cpu")
```
GPU UDF options from daft/ai/utils.py lines 35-62:
```python
def get_gpu_udf_options() -> UDFOptions:
    """Get UDF options for GPU-based providers."""
    runner = get_or_infer_runner_type()
    if runner == "native":
        from daft.internal.gpu import cuda_visible_devices

        num_gpus = len(cuda_visible_devices())
    elif runner == "ray":
        import ray

        num_gpus = 0
        for node in ray.nodes():
            if "Resources" in node:
                if "GPU" in node["Resources"] and node["Resources"]["GPU"] > 0:
                    num_gpus += int(node["Resources"]["GPU"])

    if num_gpus > 0:
        return UDFOptions(concurrency=num_gpus, num_gpus=1)
    else:
        return UDFOptions(concurrency=None, num_gpus=None)
```
Pydantic V2 requirement from daft/datatype.py lines 254-257:
```python
if not (parse("2.0.0") <= parse(pydantic.__version__) < parse("3.0.0")):
    raise ValueError(
        f"Daft only supports DataType inference for Pydantic V2, found Pydantic V{pydantic.__version__}"
    )
```
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `ImportError: No module named 'openai'` | OpenAI client library not installed. | Run `pip install "daft[openai]"`. |
| `openai.AuthenticationError: Incorrect API key` | The `OPENAI_API_KEY` environment variable is missing or invalid. | Set a valid API key: `export OPENAI_API_KEY="sk-..."`. |
| `ImportError: No module named 'torch'` | PyTorch not installed (needed for local transformer models). | Run `pip install "daft[transformers]"`. |
| `Daft only supports DataType inference for Pydantic V2` | Pydantic V1 is installed instead of V2. | Upgrade Pydantic: `pip install "pydantic>=2.0.0,<3.0.0"`. |
| `CUDA out of memory` | The model exceeds available GPU VRAM. | Use a smaller model, reduce batch size, or fall back to CPU. Consider vLLM for more efficient memory management. |
| `Invalid runner type: ...` | The runner type is not recognized when configuring GPU UDF options. | Ensure `DAFT_RUNNER` is set to either `native` or `ray`. |
Compatibility Notes
- Pillow is pinned to exactly 11.0.0 across all AI extras so that every provider uses the same image-processing version.
- numpy is constrained to < 2.4.0 for the OpenAI and Google extras; the Transformers extra does not pin numpy explicitly but inherits it via torch.
- Pydantic V2 (>= 2.0.0, < 3.0.0) is strictly required for structured output support. Pydantic V1 is not supported and will raise a `ValueError` at runtime.
- vLLM is available only as a development dependency (pinned at 0.11.0) and is not included in any user-facing extras. It is used for high-throughput local inference with prefix caching.
- MPS (Apple Silicon) support depends on PyTorch's Metal backend, which may have feature gaps compared to CUDA; not all operations are supported on MPS.
- GPU allocation behavior differs between the native and Ray runners: the native runner uses `CUDA_VISIBLE_DEVICES` to determine available GPUs, while the Ray runner queries the cluster's resource metadata.