Implementation:Explodinggradients Ragas ImageTextPrompt Class

From Leeroopedia
Revision as of 14:54, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Explodinggradients_Ragas_ImageTextPrompt_Class.md)


| Field | Value |
|---|---|
| source | Explodinggradients_Ragas (GitHub) |
| domains | Prompts, Multi_Modal |
| last_updated | 2026-02-10 00:00 GMT |

Overview

The ImageTextPrompt class extends PydanticPrompt to support multi-modal prompts that combine text and images. Its companion class, ImageTextPromptValue, validates image content and guards against server-side request forgery (SSRF) when images are fetched.

Description

ImageTextPrompt[InputModel, OutputModel] overrides _generate_examples to produce text-only example strings and provides to_prompt_value, which converts input data into an ImageTextPromptValue. The generate_multiple method handles both LangChain and Ragas LLM backends and uses RagasOutputParser for structured output parsing.

ImageTextPromptValue extends LangChain's PromptValue and processes each item securely. It checks for base64 data URIs, validates URLs against SSRF attacks by resolving the hostname and rejecting loopback, private, and reserved IP addresses, downloads images with streaming and a size limit, and validates the downloaded content with Pillow. It optionally supports local file access with path traversal protection.

Security constants include ALLOWED_URL_SCHEMES, MAX_DOWNLOAD_SIZE_BYTES (10 MB), and REQUESTS_TIMEOUT_SECONDS (10 s), along with configurable flags that control access to internal targets and local files.
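The checks described above can be sketched with the standard library alone. This is a minimal illustration, not the code in multi_modal_prompt.py: the helper names (parse_image_data_uri, is_disallowed_ip, read_capped), the regular expression, and the constant values are assumptions made for the example. DNS resolution, the HTTP request, and the Pillow validation are left out so the sketch runs offline.

```python
import base64
import ipaddress
import re
from typing import Iterable, Optional

# Assumed values mirroring the constants named above; the real module's
# definitions may differ (for example, 10 MB could be decimal, not binary).
ALLOWED_URL_SCHEMES = {"http", "https"}
MAX_DOWNLOAD_SIZE_BYTES = 10 * 1024 * 1024

# Hypothetical pattern for "data:image/<subtype>;base64,<payload>".
DATA_URI_RE = re.compile(r"^data:image/([a-zA-Z0-9.+-]+);base64,(.+)$", re.DOTALL)


def parse_image_data_uri(item: str) -> Optional[bytes]:
    """Return the decoded bytes if item is a base64 image data URI, else None."""
    match = DATA_URI_RE.match(item)
    if match is None:
        return None
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(match.group(2), validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None


def is_disallowed_ip(address: str) -> bool:
    """Return True for loopback, private, or reserved addresses."""
    ip = ipaddress.ip_address(address)
    return ip.is_loopback or ip.is_private or ip.is_reserved


def read_capped(chunks: Iterable[bytes], max_bytes: int = MAX_DOWNLOAD_SIZE_BYTES) -> bytes:
    """Accumulate streamed chunks, failing as soon as the cap is exceeded."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        if len(buffer) > max_bytes:
            raise ValueError("download exceeds size limit")
    return bytes(buffer)
```

In a real fetch, each address a hostname resolves to would be passed to is_disallowed_ip before connecting, and the chunks could come from a streamed HTTP response, such as requests' response.iter_content(). Checking the size while streaming means an oversized response is abandoned early instead of being fully buffered.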

Usage

Subclass ImageTextPrompt with input and output Pydantic models. The input model must provide a to_string_list() method that returns the text and image references. Then await generate() or generate_multiple() with a multi-modal-capable LLM.

Code Reference

| Item | Detail |
|---|---|
| Source Location | src/ragas/prompt/multi_modal_prompt.py L89-634 |
| Classes | ImageTextPrompt(PydanticPrompt, Generic[InputModel, OutputModel]), ImageTextPromptValue(PromptValue) |
| Key Methods | to_prompt_value(), generate_multiple(), to_messages(), _securely_process_item() |
| Import | from ragas.prompt import ImageTextPrompt |

I/O Contract

Inputs

| Parameter | Type | Description |
|---|---|---|
| llm | Union[BaseRagasLLM, BaseLanguageModel] | Multi-modal language model instance |
| data | InputModel | Pydantic input with a to_string_list() method returning text/image refs |
| n | int | Number of outputs to generate (default 1) |
| temperature | Optional[float] | Generation temperature |
| retries_left | int | Output parsing retry attempts (default 3) |

Outputs

| Method | Return Type | Description |
|---|---|---|
| generate() | OutputModel | Single parsed output model instance |
| generate_multiple() | List[OutputModel] | List of parsed output model instances |
| ImageTextPromptValue.to_messages() | List[BaseMessage] | HumanMessage with text/image content list |
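To make the "text/image content list" concrete, the sketch below builds such a list as plain dictionaries. It assumes the content-part convention that LangChain's HumanMessage accepts ({"type": "text", ...} and {"type": "image_url", ...}). Whether ImageTextPromptValue emits exactly these keys, or rewrites URLs as data URIs after downloading, is an assumption here, and build_content is a hypothetical helper rather than part of Ragas.

```python
from typing import Any, Dict, List


def build_content(items: List[str]) -> List[Dict[str, Any]]:
    """Map each string to a text part or an image part (illustrative only)."""
    parts: List[Dict[str, Any]] = []
    for item in items:
        # Treat URLs and image data URIs as image references; everything
        # else is passed through as plain text.
        if item.startswith(("http://", "https://", "data:image/")):
            parts.append({"type": "image_url", "image_url": {"url": item}})
        else:
            parts.append({"type": "text", "text": item})
    return parts


content = build_content(
    ["What objects are in this image?", "https://example.com/photo.jpg"]
)
```

A list of this shape could then be passed as the content of a single HumanMessage, which matches the return type listed in the table above.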

Usage Examples

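import asyncio
from typing import List

from pydantic import BaseModel
from ragas.prompt import ImageTextPrompt


class ImageInput(BaseModel):
    question: str
    image_url: str

    def to_string_list(self) -> List[str]:
        # Text and image references handed to the prompt.
        return [self.question, self.image_url]


class ImageOutput(BaseModel):
    description: str


class DescribeImage(ImageTextPrompt[ImageInput, ImageOutput]):
    instruction = "Describe the image and answer the question."
    input_model = ImageInput
    output_model = ImageOutput


async def main(multimodal_llm) -> None:
    # generate() is a coroutine, so it must be awaited inside an async function.
    prompt = DescribeImage()
    result = await prompt.generate(
        llm=multimodal_llm,
        data=ImageInput(
            question="What objects are in this image?",
            image_url="https://example.com/photo.jpg",
        ),
    )
    print(result.description)


# multimodal_llm is a placeholder for a multi-modal capable LLM instance
# (BaseRagasLLM or LangChain BaseLanguageModel) constructed elsewhere.
asyncio.run(main(multimodal_llm))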
