Implementation:Explodinggradients Ragas ImageTextPrompt Class
| Field | Value |
|---|---|
| source | Explodinggradients_Ragas (GitHub) |
| domains | Prompts, Multi_Modal |
| last_updated | 2026-02-10 00:00 GMT |
Overview
The ImageTextPrompt class extends PydanticPrompt to support multi-modal prompts that combine text and images. Its companion class, ImageTextPromptValue, handles SSRF protection and image content validation.
Description
ImageTextPrompt[InputModel, OutputModel] overrides _generate_examples to produce text-only example strings and provides to_prompt_value, which converts input data into an ImageTextPromptValue. The generate_multiple method handles both LangChain and Ragas LLM backends and uses RagasOutputParser for structured output parsing.
ImageTextPromptValue extends LangChain's PromptValue and processes each item securely:
- It checks whether the item is a base64 data URI.
- It validates URLs against SSRF attacks by resolving the host via DNS and rejecting loopback, private, and reserved IP addresses.
- It downloads images by streaming, with a size limit.
- It validates the downloaded content with Pillow.
- It optionally supports local file access, with path traversal protection.
Security constants include ALLOWED_URL_SCHEMES, MAX_DOWNLOAD_SIZE_BYTES (10MB), and REQUESTS_TIMEOUT_SECONDS (10s), along with configurable flags that control whether internal network targets and local file access are allowed.
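
The URL check and the size-limited download described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the Ragas implementation: the helper names is_safe_url and read_capped are hypothetical, the scheme set {"http", "https"} is assumed (the source does not list the allowed schemes), and only the 10 MB limit comes from the description.

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Assumed values: the scheme set is a guess; the 10 MB cap follows the text.
ALLOWED_URL_SCHEMES = {"http", "https"}
MAX_DOWNLOAD_SIZE_BYTES = 10 * 1024 * 1024


def is_safe_url(url: str) -> bool:
    """Hypothetical helper: True only if every resolved address is public."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_URL_SCHEMES or not parsed.hostname:
        return False
    try:
        # Resolve the host; numeric IP literals resolve without a DNS query.
        infos = socket.getaddrinfo(parsed.hostname, None)
    except (socket.gaierror, UnicodeError):
        return False  # unresolvable hosts are rejected
    for info in infos:
        # info[4][0] is the address string; drop any IPv6 scope suffix.
        addr = ipaddress.ip_address(info[4][0].split("%")[0])
        if addr.is_loopback or addr.is_private or addr.is_reserved:
            return False
    return True


def read_capped(stream, limit: int = MAX_DOWNLOAD_SIZE_BYTES,
                chunk_size: int = 8192) -> bytes:
    """Hypothetical helper: read a binary stream in chunks, enforcing a cap."""
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return bytes(buf)
        buf.extend(chunk)
        if len(buf) > limit:
            raise ValueError(f"download exceeds {limit} bytes")
```

Checking every resolved address, rather than only the first, matters because a hostname can map to several records. Reading in chunks lets the cap take effect before an oversized body is fully buffered.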
Usage
Subclass ImageTextPrompt with input/output Pydantic models where the input model provides a to_string_list() method returning text and image references. Call generate() or generate_multiple() with a multi-modal capable LLM.
Code Reference
| Item | Detail |
|---|---|
| Source Location | src/ragas/prompt/multi_modal_prompt.py L89-634 |
| Classes | ImageTextPrompt(PydanticPrompt, Generic[InputModel, OutputModel]), ImageTextPromptValue(PromptValue) |
| Key Methods | to_prompt_value(), generate_multiple(), to_messages(), _securely_process_item() |
| Import | from ragas.prompt import ImageTextPrompt |
I/O Contract
Inputs
| Parameter | Type | Description |
|---|---|---|
| llm | Union[BaseRagasLLM, BaseLanguageModel] | Multi-modal language model instance |
| data | InputModel | Pydantic input with a to_string_list() method returning text and image references |
| n | int | Number of outputs to generate (default 1) |
| temperature | Optional[float] | Generation temperature |
| retries_left | int | Output parsing retry attempts (default 3) |
Outputs
| Method | Return Type | Description |
|---|---|---|
| generate() | OutputModel | Single parsed output model instance |
| generate_multiple() | List[OutputModel] | List of parsed output model instances |
| ImageTextPromptValue.to_messages() | List[BaseMessage] | HumanMessage with text/image content list |
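
To make the "text/image content list" concrete, the sketch below builds one from a list of strings using plain dictionaries. It assumes the OpenAI-style content parts that LangChain's HumanMessage accepts (`{"type": "text", ...}` and `{"type": "image_url", ...}`). The helper name build_content is hypothetical, and the exact structure that to_messages() emits is not confirmed by this page.

```python
import base64
import binascii
from typing import Any, Dict, List


def build_content(items: List[str]) -> List[Dict[str, Any]]:
    """Hypothetical helper: map strings to text parts or image parts."""
    content: List[Dict[str, Any]] = []
    for item in items:
        if item.startswith("data:image/") and ";base64," in item:
            payload = item.split(";base64,", 1)[1]
            try:
                # Reject payloads that are not valid base64.
                base64.b64decode(payload, validate=True)
            except binascii.Error:
                content.append({"type": "text", "text": item})
                continue
            content.append({"type": "image_url", "image_url": {"url": item}})
        else:
            content.append({"type": "text", "text": item})
    return content


# Placeholder bytes, not a real image; used only to form a data URI.
encoded = base64.b64encode(b"placeholder image bytes").decode("ascii")
parts = build_content(["What is shown?", f"data:image/png;base64,{encoded}"])
```

In this sketch a malformed data URI falls back to a text part. A stricter design could raise an error instead.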
Usage Examples
```python
from pydantic import BaseModel
from ragas.prompt import ImageTextPrompt


class ImageInput(BaseModel):
    question: str
    image_url: str

    def to_string_list(self):
        # Text first, then the image reference.
        return [self.question, self.image_url]


class ImageOutput(BaseModel):
    description: str


class DescribeImage(ImageTextPrompt[ImageInput, ImageOutput]):
    instruction = "Describe the image and answer the question."
    input_model = ImageInput
    output_model = ImageOutput


async def main(multimodal_llm):
    # multimodal_llm: a multi-modal capable LLM instance, configured elsewhere.
    prompt = DescribeImage()
    result = await prompt.generate(
        llm=multimodal_llm,
        data=ImageInput(
            question="What objects are in this image?",
            image_url="https://example.com/photo.jpg",
        ),
    )
    print(result.description)

# Run with asyncio.run(main(llm)), passing your own LLM instance.
```
Related Pages
- PydanticPrompt_Class - Parent class providing generation and parsing
- BasePrompt_Class - Root of the prompt hierarchy
- PromptUtils_Module - JSON extraction used during output parsing