Implementation:Run llama Llama index MultiModalEmbedding

From Leeroopedia
Knowledge Sources
Domains Embeddings, MultiModal
Last Updated 2026-02-11 19:00 GMT

Overview

Defines the abstract base class for multi-modal embedding models that can generate embeddings for both text and images.

Description

The MultiModalEmbedding class extends BaseEmbedding, adding image embedding to the inherited text embedding functionality. It is an abstract base class: in addition to the abstract text and query methods inherited from BaseEmbedding, subclasses must implement two abstract image methods, _get_image_embedding (synchronous) and _aget_image_embedding (asynchronous). Both accept an ImageType (typically a file path or URL) and return an Embedding (a list of floats).

The class provides the following public methods and overridable batch helpers:

  • get_image_embedding and aget_image_embedding wrap the abstract methods with callback manager integration, emitting CBEventType.EMBEDDING events with serialized payloads for observability.
  • _get_image_embeddings and _aget_image_embeddings provide default batch implementations: the synchronous version loops over individual embeddings while the async version uses asyncio.gather for concurrent execution.
  • get_image_embedding_batch processes a list of image paths in batches controlled by embed_batch_size (inherited from BaseEmbedding), with optional tqdm progress bar support. Each batch is wrapped in a callback event.
  • aget_image_embedding_batch is the asynchronous counterpart that gathers all batch coroutines concurrently and supports tqdm.asyncio for async progress tracking.
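
The default batching behaviour described in the list above can be sketched in plain Python, without the library. This is a simplified illustration under stated assumptions, not the library's source: ToyImageEmbedder, its fake length-based vectors, and the batch_size attribute (standing in for embed_batch_size) are hypothetical, and callback events and tqdm progress bars are left out.

```python
import asyncio
from typing import List

Embedding = List[float]


class ToyImageEmbedder:
    """Hypothetical stand-in for a MultiModalEmbedding subclass."""

    def __init__(self, batch_size: int = 2) -> None:
        self.batch_size = batch_size  # plays the role of embed_batch_size

    def _get_image_embedding(self, img_file_path: str) -> Embedding:
        # Fake embedding: a real model would load and encode the image.
        return [float(len(img_file_path)), 1.0]

    async def _aget_image_embedding(self, img_file_path: str) -> Embedding:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return self._get_image_embedding(img_file_path)

    def _get_image_embeddings(self, paths: List[str]) -> List[Embedding]:
        # Synchronous default: embed one image at a time.
        return [self._get_image_embedding(p) for p in paths]

    async def _aget_image_embeddings(self, paths: List[str]) -> List[Embedding]:
        # Asynchronous default: run the per-image coroutines concurrently.
        coros = [self._aget_image_embedding(p) for p in paths]
        return list(await asyncio.gather(*coros))

    def get_image_embedding_batch(self, paths: List[str]) -> List[Embedding]:
        results: List[Embedding] = []
        # Walk the input in slices of at most batch_size items.
        for start in range(0, len(paths), self.batch_size):
            batch = paths[start:start + self.batch_size]
            results.extend(self._get_image_embeddings(batch))
        return results

    async def aget_image_embedding_batch(self, paths: List[str]) -> List[Embedding]:
        batches = [
            paths[i:i + self.batch_size]
            for i in range(0, len(paths), self.batch_size)
        ]
        # Gather all batch coroutines at once; gather preserves input order.
        nested = await asyncio.gather(
            *(self._aget_image_embeddings(b) for b in batches)
        )
        return [emb for batch in nested for emb in batch]


paths = ["a.png", "bb.png", "ccc.png"]
embedder = ToyImageEmbedder(batch_size=2)
sync_result = embedder.get_image_embedding_batch(paths)
async_result = asyncio.run(embedder.aget_image_embedding_batch(paths))
```

Because asyncio.gather returns results in the order of its arguments, the synchronous and asynchronous paths in this sketch return the embeddings in the same order as the input list.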

Usage

Use this class as a base when implementing a multi-modal embedding provider (e.g., CLIP, OpenAI multi-modal embeddings). Subclass it and implement _get_image_embedding and _aget_image_embedding for your specific model. The resulting embedding model can then be used in multi-modal retrieval pipelines to embed both documents and images into a shared vector space.
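
To make the shared vector space idea concrete, here is a minimal sketch that ranks images against a query by cosine similarity. The vectors are made up for illustration, and cosine_similarity is a hypothetical helper written here, not a function from the library; in practice the vectors would come from the text and image methods of a MultiModalEmbedding subclass.

```python
import math
from typing import Dict, List

Embedding = List[float]


def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up image embeddings; a real model would produce much longer vectors.
image_embeddings: Dict[str, Embedding] = {
    "cat.png": [1.0, 0.0],
    "dog.png": [0.0, 1.0],
}

# Made-up embedding of a text query in the same space.
query_embedding: Embedding = [0.9, 0.1]

# Rank image names from most to least similar to the query.
ranked = sorted(
    image_embeddings,
    key=lambda name: cosine_similarity(query_embedding, image_embeddings[name]),
    reverse=True,
)
```

The comparison is only meaningful when the text and image vectors come from the same model, so that both modalities share one embedding space and dimensionality.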

Code Reference

Source Location

Signature

class MultiModalEmbedding(BaseEmbedding):
    """Base class for Multi Modal embeddings."""

    @abstractmethod
    def _get_image_embedding(self, img_file_path: ImageType) -> Embedding: ...

    @abstractmethod
    async def _aget_image_embedding(self, img_file_path: ImageType) -> Embedding: ...

    def get_image_embedding(self, img_file_path: ImageType) -> Embedding: ...
    async def aget_image_embedding(self, img_file_path: ImageType) -> Embedding: ...
    def get_image_embedding_batch(
        self, img_file_paths: List[ImageType], show_progress: bool = False
    ) -> List[Embedding]: ...
    async def aget_image_embedding_batch(
        self, img_file_paths: List[ImageType], show_progress: bool = False
    ) -> List[Embedding]: ...

Import

from llama_index.core.embeddings.multi_modal_base import MultiModalEmbedding

I/O Contract

Inputs

Name | Type | Required | Description
img_file_path | ImageType | Yes | A file path or URL to an image (for single-image methods).
img_file_paths | List[ImageType] | Yes | A list of image file paths or URLs (for batch methods).
show_progress | bool | No | If True, displays a tqdm progress bar during batch embedding. Defaults to False.

Outputs

Name | Type | Description
embedding | Embedding (List[float]) | The embedding vector for a single image.
embeddings | List[Embedding] | A list of embedding vectors for a batch of images.

Usage Examples

from llama_index.core.embeddings.multi_modal_base import MultiModalEmbedding
from llama_index.core.schema import ImageType
from llama_index.core.base.embeddings.base import Embedding

# Subclass to implement a custom multi-modal embedding
class MyMultiModalEmbedding(MultiModalEmbedding):
    def _get_text_embedding(self, text: str) -> Embedding:
        # Implement text embedding logic
        ...

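    def _get_query_embedding(self, query: str) -> Embedding:
        # Implement query embedding logic
        ...

    async def _aget_query_embedding(self, query: str) -> Embedding:
        # Implement async query embedding logic; this method is also
        # abstract in BaseEmbedding, so the class cannot be instantiated without it
        ...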

    def _get_image_embedding(self, img_file_path: ImageType) -> Embedding:
        # Implement image embedding logic
        ...

    async def _aget_image_embedding(self, img_file_path: ImageType) -> Embedding:
        # Implement async image embedding logic
        ...

# Use the embedding model
embed_model = MyMultiModalEmbedding()
image_embedding = embed_model.get_image_embedding("path/to/image.png")
batch_embeddings = embed_model.get_image_embedding_batch(
    ["img1.png", "img2.png"], show_progress=True
)
