Implementation:Neuml Txtai RAG Call

From Leeroopedia


Knowledge Sources
Domains NLP, RAG
Last Updated 2026-02-09 00:00 GMT

Overview

A concrete tool, provided by the txtai library, for executing end-to-end RAG queries: search, context assembly, LLM generation, and answer formatting.

Description

The RAG.__call__ method executes the complete RAG question-answering flow. Given one or more questions, it searches the configured embeddings index for relevant context passages, assembles them into a prompt using the configured template, sends the prompt to the language model, and returns formatted answers.

The method accepts questions in multiple formats. A single string is the simplest form. A list of strings enables batch processing of multiple questions. Structured inputs as tuples (name, query, question, snippet) or dictionaries with corresponding keys allow the search query to differ from the displayed question and propagate identifiers through to the output. When a list of dictionaries is provided, the output mirrors the dictionary format.
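
The normalization described above can be pictured in plain Python. This is an illustrative sketch, not txtai's actual implementation; it only assumes the (name, query, question, snippet) field order documented in the signature below:

```python
# Illustrative sketch of input normalization; not txtai's actual code.
def normalize(queue):
    """Coerce the queue argument into a list of (name, query, question, snippet) tuples."""
    single = not isinstance(queue, list)
    items = [queue] if single else queue

    normalized = []
    for item in items:
        if isinstance(item, str):
            # A bare string serves as both the search query and the displayed question
            normalized.append((None, item, item, None))
        elif isinstance(item, dict):
            normalized.append((item.get("name"), item.get("query"),
                               item.get("question"), item.get("snippet")))
        else:
            normalized.append(tuple(item))
    return normalized, single

rows, single = normalize("Who created Python?")
# rows -> [(None, "Who created Python?", "Who created Python?", None)]
```

The `single` flag records whether the caller passed a bare element, so the result can later be unwrapped from a one-element list.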

An optional texts parameter allows the caller to bypass the embeddings search entirely and supply explicit context passages. This is useful for controlled experiments, testing, or scenarios where the context is already known. When texts is provided, the passages are still scored for relevance and ordered accordingly, but no index search is performed.
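
The relevance ordering of caller-supplied passages can be pictured with a simple word-overlap scorer. This is purely illustrative; txtai uses its configured similarity model to rank the passages, not word overlap:

```python
import re

# Purely illustrative word-overlap scorer; txtai uses its configured
# similarity model to rank caller-supplied passages.
def rank_texts(question, texts):
    tokens = lambda s: set(re.findall(r"\w+", s.lower()))
    q = tokens(question)
    scored = [(i, len(q & tokens(t))) for i, t in enumerate(texts)]
    # Highest-overlap passages first, mirroring the relevance ordering
    return sorted(scored, key=lambda pair: -pair[1])

texts = [
    "The Eiffel Tower was completed in 1889.",
    "The Eiffel Tower is 330 meters tall.",
]
rank_texts("How tall is the Eiffel Tower?", texts)
```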

The output format is determined by the output parameter set during initialization. The default format returns (name, answer) tuples. The flatten format returns plain answer strings. The reference format returns (name, answer, reference) tuples where the reference is the ID of the context passage that best matches the generated answer, enabling source attribution. For single-question inputs, the method returns a single result rather than a list.
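
The three output formats can be sketched as a small formatting step over generated results. This is an illustrative sketch assuming each result is already a (name, answer, reference) triple; it is not the library's code:

```python
# Illustrative sketch of the three output formats; not txtai's actual code.
def format_output(results, output="default", single=False):
    """results: list of (name, answer, reference) triples."""
    if output == "flatten":
        formatted = [answer for _, answer, _ in results]
    elif output == "reference":
        formatted = list(results)
    else:  # "default"
        formatted = [(name, answer) for name, answer, _ in results]

    # Single-question inputs yield a single result, not a one-element list
    return formatted[0] if single else formatted

results = [("q1", "Guido van Rossum", 0)]
format_output(results, "flatten", single=True)    # "Guido van Rossum"
format_output(results, "reference", single=True)  # ("q1", "Guido van Rossum", 0)
```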

Usage

Use RAG.__call__ when you need to:

  • Answer a single question using context retrieved from an embeddings index.
  • Batch-process multiple questions in one call for efficient inference.
  • Override retrieved context with explicit text passages for controlled generation.
  • Obtain source references alongside answers for provenance tracking.

Code Reference

Source Location

  • Repository: txtai
  • File: src/python/txtai/pipeline/llm/rag.py
  • Lines: L93-145

Signature

def __call__(self, queue, texts=None, **kwargs):
    """
    Finds answers to input questions.

    Args:
        queue: input question queue (name, query, question, snippet),
               can be list of tuples/dicts/strings or a single input element
        texts: optional list of text for context, otherwise runs embeddings search
        kwargs: additional keyword arguments to pass to pipeline model

    Returns:
        list of answers matching input format (tuple or dict)
        containing fields as specified by output format
    """
    ...

Import

from txtai.pipeline import RAG

I/O Contract

Inputs

  • queue — str, list[str], list[tuple], or list[dict]. Required. Question(s) to answer. Strings are converted to (None, query, query, None) tuples internally. Dicts must have keys from: name, query, question, snippet.
  • texts — list[str] or None. Optional. Explicit context passages to use instead of searching the embeddings index. Default: None (uses index search).
  • **kwargs — keyword arguments. Optional. Additional arguments passed through to the underlying LLM pipeline (e.g., maxlength, temperature).

Outputs

  • answers (default) — tuple(name, answer) or list[tuple]. Name-answer pairs. A single tuple for single-question input, a list for batch input.
  • answers (flatten) — str or list[str]. Plain answer strings with names stripped.
  • answers (reference) — tuple(name, answer, reference) or list[tuple]. Name-answer-reference triples, where reference is the ID of the best-matching context passage.
  • answers (dict input) — dict or list[dict]. When the input is in dict format, the output mirrors it with an answer key (and optionally name and reference).

Usage Examples

Basic Example: Single Question

from txtai.embeddings import Embeddings
from txtai.pipeline import RAG

# Build index
embeddings = Embeddings({
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "content": True
})
embeddings.index([
    "Python was created by Guido van Rossum.",
    "Python was first released in 1991.",
    "Python emphasizes code readability.",
])

# Initialize RAG
rag = RAG(
    similarity=embeddings,
    path="google/flan-t5-base",
    template="Context: {context}\nQuestion: {question}\nAnswer:",
    context=3
)

# Execute a single question
answer = rag("Who created Python?")
# Returns: (None, "Guido van Rossum")
print(answer)
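
Internally, each prompt is produced by substituting the retrieved passages and the question into the configured template. A minimal sketch of that substitution (illustrative; the exact joining of passages may differ from txtai's):

```python
# Illustrative prompt assembly; the exact passage joining may differ in txtai.
template = "Context: {context}\nQuestion: {question}\nAnswer:"

passages = [
    "Python was created by Guido van Rossum.",
    "Python was first released in 1991.",
]

prompt = template.format(context=" ".join(passages),
                         question="Who created Python?")
print(prompt)
```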

Batch Question Processing

# Process multiple questions in one call
questions = [
    "Who created Python?",
    "When was Python released?",
    "What does Python emphasize?",
]

answers = rag(questions)
# Returns: [(None, "Guido van Rossum"), (None, "1991"), (None, "code readability")]
for name, answer in answers:
    print(f"Answer: {answer}")

Using Explicit Context (Bypassing Search)

# Provide explicit context instead of using index search
context_passages = [
    "The Eiffel Tower is 330 meters tall.",
    "The Eiffel Tower was completed in 1889.",
    "The Eiffel Tower is in Paris, France.",
]

answer = rag(
    "How tall is the Eiffel Tower?",
    texts=context_passages
)
print(answer)

Structured Input with Named Questions

# Use tuple format: (name, query, question, snippet)
structured_questions = [
    ("q1", "Python creator", "Who created the Python programming language?", None),
    ("q2", "Python release date", "When was Python first released?", None),
]

answers = rag(structured_questions)
# Returns: [("q1", "Guido van Rossum"), ("q2", "1991")]
for name, answer in answers:
    print(f"{name}: {answer}")

Dictionary Input Format

# Use dict format for named fields
dict_questions = [
    {"name": "q1", "query": "Python creator", "question": "Who created Python?"},
    {"name": "q2", "query": "Python release", "question": "When was Python released?"},
]

answers = rag(dict_questions)
# Returns: [{"name": "q1", "answer": "Guido van Rossum"}, {"name": "q2", "answer": "1991"}]
for row in answers:
    print(f"{row['name']}: {row['answer']}")

Reference Output with Source Attribution

# Initialize RAG with reference output
rag_ref = RAG(
    similarity=embeddings,
    path="google/flan-t5-base",
    template="Context: {context}\nQuestion: {question}\nAnswer:",
    output="reference",
    context=3
)

answer = rag_ref("Who created Python?")
# Returns: (None, "Guido van Rossum", 0)
# The third element (0) is the ID of the source passage
name, text, reference_id = answer
print(f"Answer: {text} (source document ID: {reference_id})")

Related Pages

Implements Principle

Requires Environment
