Environment: Eventual Inc Daft AI Provider Dependencies
| Knowledge Sources | |
|---|---|
| Domains | AI/ML, LLM, Embeddings, GPU Computing, Transformers |
| Last Updated | 2026-02-08 15:30 GMT |
Overview
The AI_Provider_Dependencies environment defines the optional Python packages, API credentials, and GPU configuration required for Daft's AI integrations, including LLM prompting (OpenAI, Google Gemini, vLLM), text/image embedding (Transformers, Sentence Transformers), and GPU-accelerated UDF execution.
Description
Daft provides first-class AI integrations that allow users to apply large language models and embedding models directly within dataframe operations. These integrations are organized into several provider-specific extras, each bringing its own set of dependencies:
- OpenAI -- Cloud-based LLM and embedding API via the OpenAI Python client. Requires an API key and supports structured outputs via Pydantic V2 schemas.
- Google Gemini -- Cloud-based LLM via the Google GenAI client library.
- Transformers/HuggingFace -- Local model inference using PyTorch-backed Transformers and Sentence Transformers. Supports CPU and GPU execution with automatic device detection.
- vLLM -- High-throughput local LLM inference with prefix caching. Available as a development dependency.
Daft includes automatic GPU detection that probes for CUDA (NVIDIA GPUs) and MPS (Apple Silicon) and falls back to CPU when neither is available. When GPUs are present, the AI functions automatically configure UDF concurrency and resource allocation, with behavior that depends on whether the native or Ray runner is in use.
Pydantic V2 is required for structured output support, where LLM responses are parsed into typed data structures defined by Pydantic models.
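As an illustration of the structured-output flow, here is a minimal Pydantic V2 model of the kind an LLM response could be parsed into. The `Sentiment` model and the JSON payload are hypothetical examples, not part of Daft's API:

```python
from pydantic import BaseModel


class Sentiment(BaseModel):
    """Hypothetical schema that a structured LLM response is validated against."""

    label: str
    confidence: float


# An LLM configured for structured output returns JSON matching the schema,
# which Pydantic V2 validates into a typed object.
raw = '{"label": "positive", "confidence": 0.97}'
result = Sentiment.model_validate_json(raw)
print(result.label, result.confidence)
```

Because the schema is an ordinary Pydantic model, downstream dataframe operations can rely on typed fields rather than free-form text.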
Usage
Use this environment when:
- Applying LLM prompting to dataframe columns (e.g., text classification, summarization, generation)
- Computing text or image embeddings at scale
- Running local transformer models on GPU or CPU
- Using structured outputs with Pydantic V2 schemas
- Deploying GPU-accelerated UDFs across native or Ray runners
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| Python | >= 3.10 | Inherited from the core environment |
| CUDA | >= 11.8 (for GPU support) | Required for NVIDIA GPU acceleration with PyTorch/vLLM |
| MPS | macOS with Apple Silicon | Alternative GPU backend for Apple Silicon (M1/M2/M3/M4) |
| RAM | 8 GB+ recommended | Local model inference can require significant memory; varies by model size |
| GPU VRAM | Varies by model | Transformer models may require 2-48+ GB VRAM depending on model size |
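As a rough rule of thumb (our assumption, not a figure from Daft's documentation), the VRAM needed just for model weights is parameter count times bytes per parameter; activations and KV caches add more on top. A quick sketch:

```python
def estimate_weight_vram_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Rough VRAM (GiB) for model weights alone.

    Default of 2 bytes/param corresponds to fp16/bf16; use 4 for fp32,
    1 for int8 quantization. Runtime overhead is not included.
    """
    return num_params * bytes_per_param / 1024**3


# A 7B-parameter model in fp16 needs roughly 13 GiB for weights alone.
print(round(estimate_weight_vram_gb(7e9), 1))
```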
Dependencies
System Packages
- All core environment system packages
- CUDA toolkit (optional) -- for NVIDIA GPU support with PyTorch
- cuDNN (optional) -- for optimized GPU neural network operations
Python Packages
OpenAI Extra (daft[openai])
- openai < 2.9.0 -- OpenAI Python client library
- numpy < 2.4.0 -- Numerical computing (required for embedding operations)
- pillow == 11.0.0 -- Image processing (required for multimodal inputs)
Google Gemini Extra (daft[google])
- google-genai < 1.53.0 -- Google GenAI client library
- numpy < 2.4.0 -- Numerical computing
- pillow == 11.0.0 -- Image processing
Transformers Extra (daft[transformers])
- transformers < 4.58.0 -- HuggingFace Transformers library
- sentence-transformers < 5.2.0 -- Sentence embedding models
- torch < 2.10.0 -- PyTorch deep learning framework
- torchvision < 0.25.0 -- PyTorch vision utilities
- pillow == 11.0.0 -- Image processing
vLLM (Dev Dependency)
- vllm == 0.11.0 -- High-throughput LLM inference engine with prefix caching
Structured Outputs
- pydantic >= 2.0.0, < 3.0.0 -- Required for structured output support (DataType inference from Pydantic models)
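The Pydantic major-version requirement amounts to a simple gate on the installed version string. A dependency-free sketch (the helper name `pydantic_version_supported` is ours, not Daft's):

```python
def pydantic_version_supported(version: str) -> bool:
    """True only for Pydantic V2 releases (>= 2.0.0, < 3.0.0)."""
    major = int(version.split(".")[0])
    return major == 2


print(pydantic_version_supported("2.11.3"))   # True: a V2 release
print(pydantic_version_supported("1.10.13"))  # False: V1 is rejected
```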
Credentials
| Variable | Provider | Description | Required |
|---|---|---|---|
| OPENAI_API_KEY | OpenAI | API key for OpenAI endpoints (GPT, embeddings) | Yes (when using OpenAI provider) |
| OPENROUTER_API_KEY | OpenRouter | API key for OpenRouter multi-provider LLM gateway | Yes (when using OpenRouter provider) |
| GOOGLE_API_KEY | Google Gemini | API key for Google GenAI (Gemini models) | Yes (when using Google provider) |
Quick Install
```bash
# OpenAI support (cloud-based LLM and embeddings)
pip install "daft[openai]"

# Google Gemini support (cloud-based LLM)
pip install "daft[google]"

# HuggingFace Transformers support (local model inference)
pip install "daft[transformers]"

# All AI providers
pip install "daft[openai,google,transformers]"
```
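After installing, you can verify which provider dependencies are importable without actually importing them. A small stdlib sketch (the `module_available` helper and the provider-to-module mapping are our assumptions):

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:  # raised when a parent package is missing
        return False


# Map each Daft extra to the top-level module it installs.
for provider, module in [
    ("openai", "openai"),
    ("google", "google.genai"),
    ("transformers", "transformers"),
]:
    status = "ok" if module_available(module) else "missing"
    print(f"{provider}: {status}")
```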
GPU Detection and Allocation
Daft includes automatic GPU detection and resource allocation logic that adapts to the active runner.
Device Detection
From daft/ai/utils.py, the get_torch_device() function probes available hardware in order of preference:
- CUDA -- NVIDIA GPU with CUDA support (highest priority)
- MPS -- Apple Silicon Metal Performance Shaders
- CPU -- Fallback when no GPU is available
GPU UDF Allocation by Runner
| Runner | GPU Detection Method | Behavior |
|---|---|---|
| Native | Reads `CUDA_VISIBLE_DEVICES` environment variable | Uses all GPUs visible to the current process. Sets UDF concurrency to the number of GPUs; each UDF instance gets 1 GPU. |
| Ray | Queries `ray.nodes()` for cluster GPU resources | Counts total GPUs across all Ray cluster nodes. Sets UDF concurrency to the total GPU count; each actor requests 1 GPU. |
| No GPUs | N/A | Falls back to CPU execution with default thread-based concurrency. |
Code Evidence
OpenAI dependencies from pyproject.toml line 49:
```toml
openai = ["openai<2.9.0", "numpy<2.4.0", "pillow==11.0.0"]
```
Google Gemini dependencies from pyproject.toml line 39:
```toml
google = ["google-genai<1.53.0", "numpy<2.4.0", "pillow==11.0.0"]
```
Transformers dependencies from pyproject.toml line 57:
```toml
transformers = ["transformers<4.58.0", "sentence-transformers<5.2.0", "torch<2.10.0", "torchvision<0.25.0", "pillow==11.0.0"]
```
GPU device detection from daft/ai/utils.py lines 19-32:
```python
def get_torch_device() -> torch.device:
    """Get the best available PyTorch device for computation."""
    import torch

    # 1. CUDA GPU (if available) - for NVIDIA GPUs with CUDA support
    if torch.cuda.is_available():
        return torch.device("cuda")
    # 2. MPS (Metal Performance Shaders) - for Apple Silicon Macs
    if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
        return torch.device("mps")
    # 3. CPU - as fallback when no GPU acceleration is available
    return torch.device("cpu")
```
GPU UDF options from daft/ai/utils.py lines 35-62:
```python
def get_gpu_udf_options() -> UDFOptions:
    """Get UDF options for GPU-based providers."""
    runner = get_or_infer_runner_type()
    if runner == "native":
        from daft.internal.gpu import cuda_visible_devices

        num_gpus = len(cuda_visible_devices())
    elif runner == "ray":
        import ray

        num_gpus = 0
        for node in ray.nodes():
            if "Resources" in node:
                if "GPU" in node["Resources"] and node["Resources"]["GPU"] > 0:
                    num_gpus += int(node["Resources"]["GPU"])

    if num_gpus > 0:
        return UDFOptions(concurrency=num_gpus, num_gpus=1)
    else:
        return UDFOptions(concurrency=None, num_gpus=None)
```
Pydantic V2 requirement from daft/datatype.py lines 254-257:
```python
if not (parse("2.0.0") <= parse(pydantic.__version__) < parse("3.0.0")):
    raise ValueError(
        f"Daft only supports DataType inference for Pydantic V2, found Pydantic V{pydantic.__version__}"
    )
```
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `ImportError: No module named 'openai'` | OpenAI client library not installed. | Run `pip install "daft[openai]"`. |
| `openai.AuthenticationError: Incorrect API key` | The `OPENAI_API_KEY` environment variable is missing or invalid. | Set a valid API key: `export OPENAI_API_KEY="sk-..."`. |
| `ImportError: No module named 'torch'` | PyTorch not installed (needed for local transformer models). | Run `pip install "daft[transformers]"`. |
| `Daft only supports DataType inference for Pydantic V2` | Pydantic V1 is installed instead of V2. | Upgrade Pydantic: `pip install "pydantic>=2.0.0,<3.0.0"`. |
| `CUDA out of memory` | The model exceeds available GPU VRAM. | Use a smaller model, reduce batch size, or fall back to CPU. Consider vLLM for more efficient memory management. |
| `Invalid runner type: ...` | The runner type is not recognized when configuring GPU UDF options. | Ensure `DAFT_RUNNER` is set to either `native` or `ray`. |
Compatibility Notes
- Pillow is pinned to exactly 11.0.0 across all AI extras so that every provider uses the same image-processing version.
- numpy is constrained to < 2.4.0 for the OpenAI and Google extras; the Transformers extra does not pin numpy explicitly but inherits it via torch.
- Pydantic V2 (>= 2.0.0, < 3.0.0) is strictly required for structured output support. Pydantic V1 is not supported and will raise a `ValueError` at runtime.
- vLLM is available only as a development dependency (pinned at 0.11.0) and is not included in any user-facing extras. It is used for high-throughput local inference with prefix caching.
- MPS (Apple Silicon) support depends on PyTorch's Metal backend, which may have feature gaps compared to CUDA; not all operations are supported on MPS.
- GPU allocation behavior differs between the native and Ray runners: the native runner uses `CUDA_VISIBLE_DEVICES` to determine available GPUs, while the Ray runner queries the cluster's resource metadata.