Implementation:AnswerDotAI RAGatouille RAGPretrainedModel Encode
| Knowledge Sources | |
|---|---|
| Domains | NLP, Information_Retrieval, Encoding |
| Last Updated | 2026-02-12 12:00 GMT |
Overview
Concrete tool, provided by the RAGatouille library, for encoding documents into in-memory token-level embedding tensors for index-free search.
Description
The RAGPretrainedModel.encode() method encodes documents into dense token-level embedding tensors held in memory. It delegates to ColBERT.encode(), which calls ColBERT._encode_index_free_documents() to produce embeddings via the inference checkpoint, then pads them to a uniform length and stores them alongside attention masks. It supports incremental encoding (multiple calls append to the existing tensors) and optional per-document metadata.
The delegation chain:
- RAGPretrainedModel.encode() → validates and delegates
- ColBERT.encode() → manages max_tokens, pads embeddings, stores/appends tensors and metadata
- ColBERT._encode_index_free_documents() → runs inference checkpoint, returns raw embeddings and masks
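The padding-and-masking step in the middle of this chain can be sketched as follows. This is a minimal illustration using plain Python lists rather than the library's torch tensors; pad_embeddings is a hypothetical helper, not part of the RAGatouille API:

```python
def pad_embeddings(doc_embeddings, dim=4):
    """Pad per-document token embeddings to one uniform length.

    doc_embeddings: list of documents, each a list of token vectors.
    Returns (padded, masks), where each mask marks real tokens (1)
    versus padding positions (0). Sketch only; RAGatouille performs
    the equivalent operation on torch tensors.
    """
    max_len = max(len(doc) for doc in doc_embeddings)
    padded, masks = [], []
    for doc in doc_embeddings:
        pad_count = max_len - len(doc)
        padded.append(doc + [[0.0] * dim] * pad_count)  # zero-vector padding
        masks.append([1] * len(doc) + [0] * pad_count)  # attention mask
    return padded, masks

docs = [
    [[0.1] * 4, [0.2] * 4],             # 2 token vectors
    [[0.3] * 4, [0.4] * 4, [0.5] * 4],  # 3 token vectors
]
padded, masks = pad_embeddings(docs)
```

Storing the masks alongside the padded tensors lets later scoring ignore padding positions without re-tokenizing the documents.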
Usage
Call after loading a pretrained model and before calling search_encoded_docs(). Suitable for small collections only; search performance degrades rapidly as the number of documents grows.
Code Reference
Source Location
- Repository: RAGatouille
- File: ragatouille/RAGPretrainedModel.py
- Lines: L359-384
Signature
def encode(
    self,
    documents: list[str],
    bsize: Union[Literal["auto"], int] = "auto",
    document_metadatas: Optional[list[dict]] = None,
    verbose: bool = True,
    max_document_length: Union[Literal["auto"], int] = "auto",
) -> None:
    """Encode documents in memory for index-free search.

    Parameters:
        documents: The documents to encode.
        bsize: Batch size ("auto" = 32, adjusted for long docs).
        document_metadatas: Optional metadata dicts per document.
        verbose: Print progress (default True).
        max_document_length: Max token length ("auto" uses 90th percentile).
    """
Import
from ragatouille import RAGPretrainedModel
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| documents | list[str] | Yes | Documents to encode in memory |
| bsize | Union[Literal["auto"], int] | No | Batch size. "auto" (default) = 32, adjusted downward for long documents |
| document_metadatas | Optional[list[dict]] | No | Metadata dicts, one per document |
| verbose | bool | No | Print encoding progress (default True) |
| max_document_length | Union[Literal["auto"], int] | No | Max token length. "auto" calculates from 90th percentile |
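The "auto" rule for max_document_length described above (90th percentile of document token lengths) can be approximated as follows. auto_max_document_length is a hypothetical helper, and RAGatouille's exact percentile method and rounding may differ:

```python
import math

def auto_max_document_length(token_counts):
    """Approximate the "auto" rule: take the 90th percentile of
    per-document token counts as the maximum document length.
    Uses the nearest-rank method; the library's exact choice of
    percentile implementation may differ."""
    ranked = sorted(token_counts)
    idx = math.ceil(0.9 * len(ranked)) - 1  # nearest-rank 90th percentile
    return ranked[idx]

# One long outlier (1000 tokens) is excluded from the chosen max,
# so most documents are kept whole without padding everything to it.
lengths = [50, 60, 80, 100, 120, 150, 180, 200, 400, 1000]
max_len = auto_max_document_length(lengths)
```

Capping at a percentile rather than the true maximum keeps a single very long document from inflating the padded tensor for the whole collection.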
Outputs
| Name | Type | Description |
|---|---|---|
| return | None | Side effects: populates self.model.in_memory_collection, self.model.in_memory_embed_docs (tensor), self.model.doc_masks (tensor), and self.model.in_memory_metadata |
Usage Examples
Basic Document Encoding
from ragatouille import RAGPretrainedModel
RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")
documents = [
    "ColBERT uses late interaction for retrieval.",
    "BERT is a transformer-based language model.",
    "RAGatouille simplifies ColBERT usage.",
]
RAG.encode(documents)
# Documents are now in memory, ready for search
Encoding with Metadata
RAG.encode(
    documents=["Document one.", "Document two."],
    document_metadatas=[
        {"source": "wiki"},
        {"source": "arxiv"},
    ],
)
Incremental Encoding
# First batch
RAG.encode(["First batch doc 1.", "First batch doc 2."])
# Second batch — appends to existing
RAG.encode(["Second batch doc 1.", "Second batch doc 2."])
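The append semantics of repeated encode() calls can be sketched with a toy store. This is a simplified stand-in using Python lists, not RAGatouille's actual tensor storage or API; the point is that each call extends the existing state and everything is re-padded to one shared length:

```python
class InMemoryStore:
    """Toy sketch of incremental encoding: every encode() call appends
    to the stored collection and re-pads all masks to a common length,
    mirroring how appended tensors must share one uniform shape."""

    def __init__(self):
        self.collection = []  # raw documents
        self.doc_masks = []   # 1 = real token, 0 = padding

    def encode(self, docs):
        # "Encode" each doc as a mask over its whitespace tokens.
        new_masks = [[1] * len(d.split()) for d in docs]
        self.collection.extend(docs)
        self.doc_masks.extend(new_masks)
        # Re-pad every stored mask to the new global maximum length.
        max_len = max(len(m) for m in self.doc_masks)
        self.doc_masks = [m + [0] * (max_len - len(m)) for m in self.doc_masks]

store = InMemoryStore()
store.encode(["one two", "one two three"])        # first batch
store.encode(["one two three four"])              # appends to existing
```

Note the cost implication: a later batch with longer documents forces re-padding of everything already stored, which is one reason this path is meant for small collections.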
Related Pages
Implements Principle
Requires Environment
- Environment:AnswerDotAI_RAGatouille_Python_ColBERT_Dependencies
- Environment:AnswerDotAI_RAGatouille_GPU_CUDA_Runtime