Implementation:Run llama Llama index MultiModalEmbedding
| Knowledge Sources | |
|---|---|
| Domains | Embeddings, MultiModal |
| Last Updated | 2026-02-11 19:00 GMT |
Overview
Defines the abstract base class for multi-modal embedding models that can generate embeddings for both text and images.
Description
The MultiModalEmbedding class extends BaseEmbedding to add image embedding capabilities alongside the inherited text embedding functionality. It is an abstract base class that requires subclasses to implement two core abstract methods: _get_image_embedding (synchronous) and _aget_image_embedding (asynchronous), both of which accept an ImageType (a file path or URL) and return an Embedding (a list of floats).
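The abstract-method contract described above can be illustrated with a minimal, standard-library-only sketch. It is not the library's code: `ImageEmbedderSketch`, `IncompleteEmbedder`, `ToyEmbedder`, `can_instantiate`, and `embed_both_ways` are hypothetical names introduced here, and the image input is simplified to a plain string. The point is only that a subclass stays abstract until both the synchronous and the asynchronous hook are implemented.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import List, Tuple

Embedding = List[float]  # mirrors the "list of floats" described above


class ImageEmbedderSketch(ABC):
    """Hypothetical stand-in for the two-hook image embedding contract."""

    @abstractmethod
    def _get_image_embedding(self, img_file_path: str) -> Embedding:
        """Embed one image synchronously."""

    @abstractmethod
    async def _aget_image_embedding(self, img_file_path: str) -> Embedding:
        """Embed one image asynchronously."""


class IncompleteEmbedder(ImageEmbedderSketch):
    # Only the sync hook is implemented, so this class remains abstract.
    def _get_image_embedding(self, img_file_path: str) -> Embedding:
        return [0.0]


class ToyEmbedder(ImageEmbedderSketch):
    # Toy logic (an assumption for illustration): build a 2-d vector
    # from the path length and the number of "/" separators.
    def _get_image_embedding(self, img_file_path: str) -> Embedding:
        return [float(len(img_file_path)), float(img_file_path.count("/"))]

    async def _aget_image_embedding(self, img_file_path: str) -> Embedding:
        return self._get_image_embedding(img_file_path)


def can_instantiate(cls: type) -> bool:
    """Return True if cls() succeeds, False if abstract methods are missing."""
    try:
        cls()
        return True
    except TypeError:
        return False


def embed_both_ways(
    embedder: ImageEmbedderSketch, path: str
) -> Tuple[Embedding, Embedding]:
    """Run the sync hook and the async hook on the same input."""
    sync_result = embedder._get_image_embedding(path)
    async_result = asyncio.run(embedder._aget_image_embedding(path))
    return sync_result, async_result
```

In this sketch `can_instantiate(IncompleteEmbedder)` is false because Python refuses to instantiate a class with unimplemented abstract methods, which is the same mechanism that forces real subclasses to supply both hooks.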
The class provides the following methods, a mix of public entry points and underscore-prefixed batch helpers:
- get_image_embedding and aget_image_embedding wrap the abstract methods with callback manager integration, emitting CBEventType.EMBEDDING events with serialized payloads for observability.
- _get_image_embeddings and _aget_image_embeddings provide default batch implementations: the synchronous version loops over individual embeddings while the async version uses asyncio.gather for concurrent execution.
- get_image_embedding_batch processes a list of image paths in batches controlled by embed_batch_size (inherited from BaseEmbedding), with optional tqdm progress bar support. Each batch is wrapped in a callback event.
- aget_image_embedding_batch is the asynchronous counterpart that gathers all batch coroutines concurrently and supports tqdm.asyncio for async progress tracking.
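The batching behaviour described in the list above can be sketched without the library. This is an illustrative approximation under stated assumptions, not the actual implementation: `split_into_batches`, `embed_batches_sync`, `embed_batches_async`, `toy_embed`, and `toy_aembed` are hypothetical names, `batch_size` stands in for `embed_batch_size`, and callback events and tqdm progress bars are omitted.

```python
import asyncio
from typing import Awaitable, Callable, List

Embedding = List[float]


def split_into_batches(items: List[str], batch_size: int) -> List[List[str]]:
    """Split items into consecutive chunks of at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]


def embed_batches_sync(
    paths: List[str],
    embed_one: Callable[[str], Embedding],
    batch_size: int,
) -> List[Embedding]:
    """Synchronous variant: loop over batches, then over each image."""
    results: List[Embedding] = []
    for batch in split_into_batches(paths, batch_size):
        results.extend(embed_one(path) for path in batch)
    return results


async def embed_batches_async(
    paths: List[str],
    aembed_one: Callable[[str], Awaitable[Embedding]],
    batch_size: int,
) -> List[Embedding]:
    """Async variant: run every batch concurrently with asyncio.gather."""

    async def embed_batch(batch: List[str]) -> List[Embedding]:
        return list(await asyncio.gather(*(aembed_one(p) for p in batch)))

    batches = split_into_batches(paths, batch_size)
    nested = await asyncio.gather(*(embed_batch(b) for b in batches))
    # gather preserves input order, so flattening keeps results aligned with paths.
    return [embedding for batch_result in nested for embedding in batch_result]


def toy_embed(path: str) -> Embedding:
    # Hypothetical embedding: a 1-d vector holding the path length.
    return [float(len(path))]


async def toy_aembed(path: str) -> Embedding:
    return toy_embed(path)
```

Because `asyncio.gather` returns results in the order its awaitables were passed, both variants produce embeddings aligned with the input list even though the async version runs the batches concurrently.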
Usage
Use this class as a base when implementing a multi-modal embedding provider (e.g., CLIP, OpenAI multi-modal embeddings). Subclass it and implement _get_image_embedding and _aget_image_embedding for your specific model. The resulting embedding model can then be used in multi-modal retrieval pipelines to embed both documents and images into a shared vector space.
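To make the "shared vector space" idea concrete, the sketch below ranks image embeddings against a text query embedding by cosine similarity. It assumes the text and image vectors have the same dimensionality and uses made-up vectors; `cosine_similarity` and `rank_images` are hypothetical helpers written for this page, not part of the library, and a real pipeline would typically delegate this step to a vector store.

```python
import math
from typing import Dict, List, Tuple

Embedding = List[float]


def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_images(
    query_embedding: Embedding, image_embeddings: Dict[str, Embedding]
) -> List[Tuple[str, float]]:
    """Return (image path, score) pairs sorted from most to least similar."""
    scored = [
        (path, cosine_similarity(query_embedding, embedding))
        for path, embedding in image_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up 2-d vectors standing in for real model output.
example_query = [1.0, 0.0]
example_images = {"car.png": [0.0, 1.0], "cat.png": [0.9, 0.1]}
example_ranking = rank_images(example_query, example_images)
```

With these toy vectors the image whose embedding points in nearly the same direction as the query ranks first, which is the property a multi-modal retriever relies on.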
Code Reference
Source Location
- Repository: Run_llama_Llama_index
- File: llama-index-core/llama_index/core/embeddings/multi_modal_base.py
Signature
class MultiModalEmbedding(BaseEmbedding):
    """Base class for Multi Modal embeddings."""

    @abstractmethod
    def _get_image_embedding(self, img_file_path: ImageType) -> Embedding: ...

    @abstractmethod
    async def _aget_image_embedding(self, img_file_path: ImageType) -> Embedding: ...

    def get_image_embedding(self, img_file_path: ImageType) -> Embedding: ...

    async def aget_image_embedding(self, img_file_path: ImageType) -> Embedding: ...

    def get_image_embedding_batch(
        self, img_file_paths: List[ImageType], show_progress: bool = False
    ) -> List[Embedding]: ...

    async def aget_image_embedding_batch(
        self, img_file_paths: List[ImageType], show_progress: bool = False
    ) -> List[Embedding]: ...
Import
from llama_index.core.embeddings.multi_modal_base import MultiModalEmbedding
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| img_file_path | ImageType | Yes | A file path or URL to an image (for single-image methods). |
| img_file_paths | List[ImageType] | Yes | A list of image file paths or URLs (for batch methods). |
| show_progress | bool | No | If True, displays a tqdm progress bar during batch embedding. Defaults to False. |
Outputs
| Name | Type | Description |
|---|---|---|
| embedding | Embedding (List[float]) | The embedding vector for a single image. |
| embeddings | List[Embedding] | A list of embedding vectors for a batch of images. |
Usage Examples
from llama_index.core.embeddings.multi_modal_base import MultiModalEmbedding
from llama_index.core.schema import ImageType
from llama_index.core.base.embeddings.base import Embedding


# Subclass to implement a custom multi-modal embedding
class MyMultiModalEmbedding(MultiModalEmbedding):
    def _get_text_embedding(self, text: str) -> Embedding:
        # Implement text embedding logic
        ...

    def _get_query_embedding(self, query: str) -> Embedding:
        # Implement query embedding logic
        ...

    async def _aget_query_embedding(self, query: str) -> Embedding:
        # Implement async query embedding logic; this hook is also abstract
        # on BaseEmbedding, so the class cannot be instantiated without it
        ...

    def _get_image_embedding(self, img_file_path: ImageType) -> Embedding:
        # Implement image embedding logic
        ...

    async def _aget_image_embedding(self, img_file_path: ImageType) -> Embedding:
        # Implement async image embedding logic
        ...


# Use the embedding model (replace the "..." bodies above with real logic first)
embed_model = MyMultiModalEmbedding()
image_embedding = embed_model.get_image_embedding("path/to/image.png")
batch_embeddings = embed_model.get_image_embedding_batch(
    ["img1.png", "img2.png"], show_progress=True
)