Implementation:Eventual Inc Daft AI Embed Image

Knowledge Sources	Daft, Daft Docs
Domains	Data_Engineering, Computer_Vision
Last Updated	2026-02-08 00:00 GMT

Overview

A concrete tool, provided by the Daft library, for computing image embeddings on DataFrame columns.

Description

The embed_image function returns an expression that embeds images using a specified vision model and provider. It supports both local model inference (via the transformers provider with models like apple/aimv2-large-patch14-224-lit or CLIP variants) and remote API-based embedding. The function automatically selects between synchronous and asynchronous execution based on the provider, and supports GPU allocation, batch processing, and configurable concurrency.
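The choice between synchronous and asynchronous execution can be illustrated with a small, library-independent sketch. This is not Daft's implementation: the LocalEmbedder and RemoteEmbedder classes and the run_embedder helper are hypothetical names assumed only for illustration. The sketch checks whether a provider's embed method is a coroutine function and runs it accordingly.

```python
import asyncio
import inspect


class LocalEmbedder:
    """Hypothetical local provider: computes embeddings synchronously."""

    def embed(self, images):
        # Toy "embedding": a one-element vector holding the byte length.
        return [[float(len(img))] for img in images]


class RemoteEmbedder:
    """Hypothetical remote provider: computes embeddings via an awaitable call."""

    async def embed(self, images):
        await asyncio.sleep(0)  # stands in for a network round trip
        return [[float(len(img))] for img in images]


def run_embedder(embedder, images):
    """Call embed() directly, or drive it on an event loop if it is a coroutine function."""
    if inspect.iscoroutinefunction(embedder.embed):
        return asyncio.run(embedder.embed(images))
    return embedder.embed(images)


local_result = run_embedder(LocalEmbedder(), [b"abc", b"de"])
remote_result = run_embedder(RemoteEmbedder(), [b"abc", b"de"])
```

Both calls go through the same entry point, which is the point of the sketch: the caller does not need to know whether the provider is local or remote.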

Usage

Import and use this function when you need to compute dense vector embeddings of image data for visual search, image similarity, or multimodal applications.

Code Reference

Source Location

Repository: Daft
File: daft/functions/ai/__init__.py
Lines: L157-242

Signature

def embed_image(
    image: Expression,
    *,
    provider: str | Provider | None = None,
    model: str | None = None,
    **options: Unpack[EmbedImageOptions],
) -> Expression

Import

from daft.functions.ai import embed_image

# or
from daft.functions import embed_image

I/O Contract

Inputs

Name	Type	Required	Description
image	Expression (Image)	Yes	The input Image column expression to embed. Images should be decoded and optionally resized/converted to RGB beforehand.
provider	str | Provider | None	No	The embedding provider (e.g., `"transformers"`, `"openai"`). Defaults to `"transformers"` when not specified.
model	str | None	No	The vision embedding model name (e.g., `"apple/aimv2-large-patch14-224-lit"`). If `None`, the provider's default model is used.
**options	EmbedImageOptions	No	Additional provider-specific options (e.g., batch_size, concurrency).

Outputs

Name	Type	Description
return	Expression (FixedSizeList[Float32])	An Embedding expression containing fixed-size float vectors representing the image embeddings.
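Because each output is a fixed-size float vector, two embeddings can be compared directly, which is what image similarity and visual search rely on. The following is a minimal sketch in plain Python, assuming the embeddings have already been collected as lists of floats. The cosine_similarity and rank_by_similarity helpers are hypothetical and not part of Daft.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length float vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return candidate names sorted from most to least similar to the query.

    `candidates` maps a name (e.g. a file path) to its embedding vector.
    """
    return sorted(
        candidates,
        key=lambda name: cosine_similarity(query, candidates[name]),
        reverse=True,
    )


# Toy 3-dimensional vectors standing in for real image embeddings.
query_vec = [1.0, 0.0, 0.0]
toy_candidates = {
    "a.png": [0.9, 0.1, 0.0],
    "b.png": [0.0, 1.0, 0.0],
    "c.png": [-1.0, 0.0, 0.0],
}
ranking = rank_by_similarity(query_vec, toy_candidates)
```

For large collections a vector index would normally replace this linear scan; the sketch only shows the comparison itself.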

Usage Examples

Basic Usage

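import daft
from daft.functions import decode_image, embed_image

df = (
    # Discover the example images hosted on Hugging Face
    daft.from_glob_path("hf://datasets/datasets-examples/doc-image-3/images")
    # Download the raw bytes for each discovered path
    .with_column("image_bytes", daft.col("path").download())
    # Decode the bytes into Daft's Image type
    .with_column("image", decode_image(daft.col("image_bytes")))
    # Convert to RGB and resize to 288x288 before embedding
    .with_column("image_rgb", daft.col("image").convert_image("RGB").resize(288, 288))
    # Embed each image with a local transformers model
    .with_column(
        "embeddings",
        embed_image(
            daft.col("image_rgb"),
            provider="transformers",
            model="apple/aimv2-large-patch14-224-lit",
        ),
    )
)
df.show()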

Related Pages

Implements Principle

Principle:Eventual_Inc_Daft_AI_Image_Embedding

Requires Environment
