Implementation:Explodinggradients Ragas ImageTextPrompt Class

Field	Value
source	Explodinggradients_Ragas (GitHub)
domains	Prompts, Multi_Modal
last_updated	2026-02-10 00:00 GMT

Overview

The ImageTextPrompt class extends PydanticPrompt to support multi-modal prompts that combine text and images. SSRF protection and image content validation are handled by the companion ImageTextPromptValue class.

Description

ImageTextPrompt[InputModel, OutputModel] overrides _generate_examples to produce text-only example strings and provides to_prompt_value, which converts input data into an ImageTextPromptValue. The generate_multiple method handles both LangChain and Ragas LLM backends and uses RagasOutputParser for structured output parsing.

ImageTextPromptValue extends LangChain's PromptValue and processes each item securely. It checks for base64 data URIs, validates URLs against SSRF attacks (resolving the hostname via DNS and rejecting loopback, private, and reserved IPs), downloads images in streaming mode with a size limit, validates the downloaded content with Pillow, and optionally supports local file access with path traversal protection. Security constants include ALLOWED_URL_SCHEMES, MAX_DOWNLOAD_SIZE_BYTES (10MB), REQUESTS_TIMEOUT_SECONDS (10s), and configurable flags that control access to internal network targets and to local files.
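The URL validation step described above can be illustrated with the standard library alone. This is a minimal sketch, not the Ragas implementation: the function name `is_url_safe` is hypothetical, and the contents of `ALLOWED_URL_SCHEMES` are an assumption, since this page names the constant without listing its values.

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Assumed values; the page names this constant but does not list its contents.
ALLOWED_URL_SCHEMES = {"http", "https"}


def is_url_safe(url: str) -> bool:
    """Return True only if the scheme is allowed and every resolved IP is public."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_URL_SCHEMES or not parsed.hostname:
        return False
    try:
        # Resolve the hostname; a host may map to several addresses.
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False  # unresolvable hosts are rejected
    for info in infos:
        # Drop any IPv6 scope suffix (e.g. "%eth0") before parsing.
        address = info[4][0].split("%", 1)[0]
        ip = ipaddress.ip_address(address)
        if ip.is_loopback or ip.is_private or ip.is_reserved:
            return False
    return True
```

Checking every resolved address, rather than only the first, matters because a hostname can resolve to a mix of public and internal addresses.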

Usage

Subclass ImageTextPrompt with input and output Pydantic models, where the input model provides a to_string_list() method that returns the text items and image references. Then await generate() or generate_multiple() with a multi-modal capable LLM.

Code Reference

Item	Detail
Source Location	`src/ragas/prompt/multi_modal_prompt.py` L89-634
Classes	`ImageTextPrompt(PydanticPrompt, Generic[InputModel, OutputModel])`, `ImageTextPromptValue(PromptValue)`
Key Methods	`to_prompt_value()`, `generate_multiple()`, `to_messages()`, `_securely_process_item()`
Import	`from ragas.prompt import ImageTextPrompt`
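According to the description, item processing starts by checking for base64 data URIs. The sketch below shows one way such a check could look. The helper name `decode_image_data_uri` and the regular expression are hypothetical, and applying the 10MB limit to data URIs is an assumption made for illustration; this page states the limit only for downloads.

```python
import base64
import binascii
import re
from typing import Optional

MAX_DOWNLOAD_SIZE_BYTES = 10 * 1024 * 1024  # 10MB, as stated in the description

# Hypothetical pattern: accepts only base64-encoded image data URIs.
DATA_URI_RE = re.compile(r"^data:image/[a-zA-Z0-9.+-]+;base64,(.+)$", re.DOTALL)


def decode_image_data_uri(item: str) -> Optional[bytes]:
    """Return the decoded image bytes, or None if the item is not a valid data URI."""
    match = DATA_URI_RE.match(item)
    if match is None:
        return None  # plain text or a URL, to be handled by other checks
    payload = match.group(1)
    # Base64 encodes 3 bytes as 4 characters, so estimate the size before decoding.
    if len(payload) * 3 // 4 > MAX_DOWNLOAD_SIZE_BYTES:
        return None
    try:
        return base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError):
        return None  # malformed base64
```

Decoded bytes would still need content validation (the description mentions Pillow) before being trusted as an image.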

I/O Contract

Inputs

Parameter	Type	Description
`llm`	`Union[BaseRagasLLM, BaseLanguageModel]`	Multi-modal language model instance
`data`	`InputModel`	Pydantic input with `to_string_list()` method returning text/image refs
`n`	`int`	Number of outputs to generate (default 1)
`temperature`	`Optional[float]`	Generation temperature
`retries_left`	`int`	Output parsing retry attempts (default 3)

Outputs

Method	Return Type	Description
`generate()`	`OutputModel`	Single parsed output model instance
`generate_multiple()`	`List[OutputModel]`	List of parsed output model instances
`ImageTextPromptValue.to_messages()`	`List[BaseMessage]`	HumanMessage with text/image content list
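To make the "text/image content list" in the last row concrete, the sketch below builds such a list from plain dictionaries. The part shapes follow the content-part convention that LangChain chat messages accept. The function `build_content` is hypothetical and classifies items by prefix only; it stands in for the secure processing described earlier and is not the class's actual logic.

```python
from typing import Dict, List, Union

ContentPart = Dict[str, Union[str, Dict[str, str]]]

# Assumption for illustration: image references are recognised by prefix only.
IMAGE_PREFIXES = ("http://", "https://", "data:image/")


def build_content(items: List[str]) -> List[ContentPart]:
    """Map each item to a text part or an image part, preserving order."""
    content: List[ContentPart] = []
    for item in items:
        if item.startswith(IMAGE_PREFIXES):
            content.append({"type": "image_url", "image_url": {"url": item}})
        else:
            content.append({"type": "text", "text": item})
    return content
```

A list like this would be passed as the `content` of a single HumanMessage, so the model sees the text and images in the order returned by `to_string_list()`.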

Usage Examples

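from pydantic import BaseModel
from ragas.prompt import ImageTextPrompt


class ImageInput(BaseModel):
    question: str
    image_url: str

    def to_string_list(self):
        # Items are returned in order: the text first, then the image reference.
        return [self.question, self.image_url]


class ImageOutput(BaseModel):
    description: str


class DescribeImage(ImageTextPrompt[ImageInput, ImageOutput]):
    instruction = "Describe the image and answer the question."
    input_model = ImageInput
    output_model = ImageOutput


async def main(multimodal_llm):
    # multimodal_llm: a multi-modal capable LLM instance, created elsewhere.
    prompt = DescribeImage()
    # generate() is a coroutine, so it must be awaited inside an async function.
    result = await prompt.generate(
        llm=multimodal_llm,
        data=ImageInput(
            question="What objects are in this image?",
            image_url="https://example.com/photo.jpg",
        ),
    )
    print(result.description)

# From synchronous code: import asyncio; asyncio.run(main(multimodal_llm))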

Related Pages

PydanticPrompt_Class - Parent class providing generation and parsing
BasePrompt_Class - Root of the prompt hierarchy
PromptUtils_Module - JSON extraction used during output parsing
