Implementation: NeuML txtai RAG Call
| Knowledge Sources | |
|---|---|
| Domains | NLP, RAG |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool for executing end-to-end RAG queries (search, context assembly, LLM generation, answer formatting), provided by the txtai library.
Description
The RAG.__call__ method executes the complete RAG question-answering flow. Given one or more questions, it searches the configured embeddings index for relevant context passages, assembles them into a prompt using the configured template, sends the prompt to the language model, and returns formatted answers.
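The prompt-assembly step in this flow can be pictured as a simple template fill. The `build_prompt` helper below is a hypothetical sketch for illustration, not txtai's actual internals; it only shows how retrieved passages and the question combine under a template like the one used in the examples later in this document:

```python
def build_prompt(template, question, passages):
    # Join the retrieved passages into a single context block,
    # then substitute both fields into the prompt template
    context = "\n".join(passages)
    return template.format(context=context, question=question)

template = "Context: {context}\nQuestion: {question}\nAnswer:"
prompt = build_prompt(template, "Who created Python?",
                      ["Python was created by Guido van Rossum."])
# prompt now holds the fully rendered string sent to the language model
```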
The method accepts questions in multiple formats. A single string is the simplest form. A list of strings enables batch processing of multiple questions. Structured inputs as tuples (name, query, question, snippet) or dictionaries with corresponding keys allow the search query to differ from the displayed question and propagate identifiers through to the output. When a list of dictionaries is provided, the output mirrors the dictionary format.
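As a rough sketch of the normalization described above (a hypothetical helper, not the library's code), every input form reduces to the same four-field tuple:

```python
def normalize(queue):
    # Treat a single element as a one-item batch and remember that fact
    single = not isinstance(queue, list)
    items = [queue] if single else queue
    normalized = []
    for item in items:
        if isinstance(item, str):
            # A bare string serves as both the search query and the question
            normalized.append((None, item, item, None))
        elif isinstance(item, dict):
            # Dicts map onto the (name, query, question, snippet) fields
            normalized.append((item.get("name"), item.get("query"),
                               item.get("question"), item.get("snippet")))
        else:
            # Tuples are already in canonical order
            normalized.append(tuple(item))
    return normalized, single
```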
An optional texts parameter allows the caller to bypass the embeddings search entirely and supply explicit context passages. This is useful for controlled experiments, testing, or scenarios where the context is already known. When texts is provided, the passages are still scored for relevance and ordered accordingly, but no index search is performed.
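The relevance scoring of caller-supplied passages can be pictured as follows. Both `rank_context` and the toy word-overlap scorer are illustrative stand-ins for the similarity model txtai actually uses, shown only to convey that supplied texts are ranked rather than used verbatim:

```python
import string

def overlap_score(question, passage):
    # Toy relevance score: fraction of question words found in the passage
    clean = lambda s: {w.strip(string.punctuation) for w in s.lower().split()}
    qwords, pwords = clean(question), clean(passage)
    return len(qwords & pwords) / len(qwords)

def rank_context(question, texts, scorer, limit=3):
    # Score every candidate passage, then keep the top `limit` in rank order
    scored = sorted(enumerate(texts), key=lambda x: scorer(question, x[1]),
                    reverse=True)
    return [text for _, text in scored[:limit]]

passages = ["The sky is blue.", "Python is a programming language.",
            "Guido van Rossum created Python."]
top = rank_context("who created python", passages, overlap_score, limit=2)
```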
The output format is determined by the output parameter set during initialization. The default format returns (name, answer) tuples. The flatten format returns plain answer strings. The reference format returns (name, answer, reference) tuples where the reference is the ID of the context passage that best matches the generated answer, enabling source attribution. For single-question inputs, the method returns a single result rather than a list.
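The three output shapes can be mimicked with a small dispatcher; `shape` is again a hypothetical sketch of the behavior described above, not the library's implementation:

```python
def shape(results, output="default", single=False):
    # results: list of (name, answer, reference) triples
    if output == "flatten":
        # Plain answer strings, names and references stripped
        shaped = [answer for _, answer, _ in results]
    elif output == "reference":
        # Full triples for source attribution
        shaped = list(results)
    else:
        # Default format: (name, answer) pairs
        shaped = [(name, answer) for name, answer, _ in results]
    # Single-question input yields a single result, not a list
    return shaped[0] if single else shaped

results = [(None, "Guido van Rossum", 0)]
```

For example, `shape(results, "flatten", single=True)` yields the bare string `"Guido van Rossum"`, mirroring the flatten behavior described above.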
Usage
Use RAG.__call__ when you need to:
- Answer a single question using context retrieved from an embeddings index.
- Batch-process multiple questions in one call for efficient inference.
- Override retrieved context with explicit text passages for controlled generation.
- Obtain source references alongside answers for provenance tracking.
Code Reference
Source Location
- Repository: txtai
- File: src/python/txtai/pipeline/llm/rag.py (lines 93-145)
Signature
def __call__(self, queue, texts=None, **kwargs):
"""
Finds answers to input questions.
Args:
queue: input question queue (name, query, question, snippet),
can be list of tuples/dicts/strings or a single input element
texts: optional list of text for context, otherwise runs embeddings search
kwargs: additional keyword arguments to pass to pipeline model
Returns:
list of answers matching input format (tuple or dict)
containing fields as specified by output format
"""
...
Import
from txtai.pipeline import RAG
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| queue | str, list[str], list[tuple], or list[dict] | Yes | Question(s) to answer. Strings are converted to (None, query, query, None) tuples internally. Dicts must have keys from: name, query, question, snippet. |
| texts | list[str] or None | No | Explicit context passages to use instead of searching the embeddings index. Default: None (uses index search). |
| **kwargs | keyword args | No | Additional arguments passed through to the underlying LLM pipeline (e.g., maxlength, temperature). |
Outputs
| Name | Type | Description |
|---|---|---|
| answers (default) | tuple(name, answer) or list[tuple] | Name-answer pairs. Returns a single tuple for single-question input, a list for batch input. |
| answers (flatten) | str or list[str] | Plain answer strings with names stripped. |
| answers (reference) | tuple(name, answer, reference) or list[tuple] | Name-answer-reference triples where reference is the ID of the best-matching context passage. |
| answers (dict input) | dict or list[dict] | When input is dict format, output mirrors dict format with answer key (and optionally name, reference). |
Usage Examples
Basic Example: Single Question
from txtai.embeddings import Embeddings
from txtai.pipeline import RAG
# Build index
embeddings = Embeddings({
"path": "sentence-transformers/all-MiniLM-L6-v2",
"content": True
})
embeddings.index([
"Python was created by Guido van Rossum.",
"Python was first released in 1991.",
"Python emphasizes code readability.",
])
# Initialize RAG
rag = RAG(
similarity=embeddings,
path="google/flan-t5-base",
template="Context: {context}\nQuestion: {question}\nAnswer:",
context=3
)
# Execute a single question
answer = rag("Who created Python?")
# Returns: (None, "Guido van Rossum")
print(answer)
Batch Question Processing
# Process multiple questions in one call
questions = [
"Who created Python?",
"When was Python released?",
"What does Python emphasize?",
]
answers = rag(questions)
# Returns: [(None, "Guido van Rossum"), (None, "1991"), (None, "code readability")]
for name, answer in answers:
print(f"Answer: {answer}")
Using Explicit Context (Bypassing Search)
# Provide explicit context instead of using index search
context_passages = [
"The Eiffel Tower is 330 meters tall.",
"The Eiffel Tower was completed in 1889.",
"The Eiffel Tower is in Paris, France.",
]
answer = rag(
"How tall is the Eiffel Tower?",
texts=context_passages
)
print(answer)
Structured Input with Named Questions
# Use tuple format: (name, query, question, snippet)
structured_questions = [
("q1", "Python creator", "Who created the Python programming language?", None),
("q2", "Python release date", "When was Python first released?", None),
]
answers = rag(structured_questions)
# Returns: [("q1", "Guido van Rossum"), ("q2", "1991")]
for name, answer in answers:
print(f"{name}: {answer}")
Dictionary Input Format
# Use dict format for named fields
dict_questions = [
{"name": "q1", "query": "Python creator", "question": "Who created Python?"},
{"name": "q2", "query": "Python release", "question": "When was Python released?"},
]
answers = rag(dict_questions)
# Returns: [{"name": "q1", "answer": "Guido van Rossum"}, {"name": "q2", "answer": "1991"}]
for row in answers:
print(f"{row['name']}: {row['answer']}")
Reference Output with Source Attribution
# Initialize RAG with reference output
rag_ref = RAG(
similarity=embeddings,
path="google/flan-t5-base",
template="Context: {context}\nQuestion: {question}\nAnswer:",
output="reference",
context=3
)
answer = rag_ref("Who created Python?")
# Returns: (None, "Guido van Rossum", 0)
# The third element (0) is the ID of the source passage
name, text, reference_id = answer
print(f"Answer: {text} (source document ID: {reference_id})")