
Implementation:AnswerDotAI RAGatouille RAGPretrainedModel Encode

From Leeroopedia
Knowledge Sources
Domains NLP, Information_Retrieval, Encoding
Last Updated 2026-02-12 12:00 GMT

Overview

A concrete tool, provided by the RAGatouille library, for encoding documents into in-memory token-level embeddings for index-free search.

Description

The RAGPretrainedModel.encode() method encodes documents into dense token-level embedding tensors stored in memory. It delegates to ColBERT.encode(), which calls ColBERT._encode_index_free_documents() to produce embeddings via the inference checkpoint, then pads them to a uniform length and stores them alongside attention masks. The method supports incremental encoding (multiple calls append to the existing tensors) and optional per-document metadata.

The delegation chain:

  • RAGPretrainedModel.encode() → validates and delegates
  • ColBERT.encode() → manages max_tokens, pads embeddings, stores/appends tensors and metadata
  • ColBERT._encode_index_free_documents() → runs inference checkpoint, returns raw embeddings and masks
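
The padding step in this chain can be illustrated with a minimal, library-free sketch. The function name and list-of-lists representation below are illustrative stand-ins, not RAGatouille internals (the real embeddings are torch tensors of shape (num_tokens, dim)): each document's token embeddings are padded to the length of the longest document, and a boolean attention mask records which positions are real tokens.

```python
# Minimal sketch of padding token-level embeddings to a uniform
# length with attention masks. Hypothetical helper, for illustration
# of what ColBERT.encode() does internally with torch tensors.

def pad_embeddings(doc_embeddings, pad_vector):
    """Pad each document's token embeddings to the longest document,
    returning padded embeddings plus boolean attention masks."""
    max_len = max(len(doc) for doc in doc_embeddings)
    padded, masks = [], []
    for doc in doc_embeddings:
        n_pad = max_len - len(doc)
        padded.append(doc + [pad_vector] * n_pad)
        masks.append([True] * len(doc) + [False] * n_pad)
    return padded, masks

# Two "documents" with 2 and 3 token embeddings of dimension 2.
docs = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.5, 0.6], [0.7, 0.8], [0.9, 1.0]],
]
padded, masks = pad_embeddings(docs, pad_vector=[0.0, 0.0])
print(len(padded[0]), masks[0])  # → 3 [True, True, False]
```

The masks are what let later scoring ignore the padding positions when comparing query and document token embeddings.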

Usage

Use this method after loading a pretrained model and before calling search_encoded_docs(). It is intended for small collections only: because the embeddings are held in memory rather than in an index, performance degrades rapidly as the number of documents grows.

Code Reference

Source Location

  • Repository: RAGatouille
  • File: ragatouille/RAGPretrainedModel.py
  • Lines: L359-384

Signature

def encode(
    self,
    documents: list[str],
    bsize: Union[Literal["auto"], int] = "auto",
    document_metadatas: Optional[list[dict]] = None,
    verbose: bool = True,
    max_document_length: Union[Literal["auto"], int] = "auto",
) -> None:
    """Encode documents in memory for index-free search.

    Parameters:
        documents: The documents to encode.
        bsize: Batch size ("auto" = 32, adjusted for long docs).
        document_metadatas: Optional metadata dicts per document.
        verbose: Print progress (default True).
        max_document_length: Max token length ("auto" uses 90th percentile).
    """

Import

from ragatouille import RAGPretrainedModel

I/O Contract

Inputs

  • documents (list[str], required): Documents to encode in memory
  • bsize (Union[Literal["auto"], int], optional): Batch size; "auto" (default) resolves to 32, adjusted downward for long documents
  • document_metadatas (Optional[list[dict]], optional): Metadata dicts, one per document
  • verbose (bool, optional): Print encoding progress (default True)
  • max_document_length (Union[Literal["auto"], int], optional): Maximum token length; "auto" calculates it from the 90th percentile of document lengths
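
The "auto" setting for max_document_length can be sketched as a percentile cap on document token counts. This is an illustrative approximation: the helper name is hypothetical, whitespace splitting stands in for the model's tokenizer, and the exact percentile computation in RAGatouille may differ in detail.

```python
# Sketch of an "auto" max-length heuristic: cap token length at the
# 90th percentile of document lengths. Whitespace tokenization is an
# assumption for illustration; the real library uses the model tokenizer.
import math

def auto_max_length(documents, percentile=0.90):
    """Return the document length at the given percentile (nearest-rank)."""
    lengths = sorted(len(doc.split()) for doc in documents)
    idx = min(len(lengths) - 1, math.ceil(percentile * len(lengths)) - 1)
    return lengths[idx]

docs = ["one two", "one two three", "a b c d", "a b c d e f g h", "x"]
print(auto_max_length(docs))  # → 8
```

Capping at a high percentile rather than the maximum keeps one unusually long document from inflating the padded tensor for the whole collection.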

Outputs

  • return (None): The method returns nothing. Side effects: populates self.model.in_memory_collection, self.model.in_memory_embed_docs (tensor), self.model.doc_masks (tensor), and self.model.in_memory_metadata

Usage Examples

Basic Document Encoding

from ragatouille import RAGPretrainedModel

RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")

documents = [
    "ColBERT uses late interaction for retrieval.",
    "BERT is a transformer-based language model.",
    "RAGatouille simplifies ColBERT usage.",
]

RAG.encode(documents)
# Documents are now in memory, ready for search

# Query the in-memory embeddings produced by encode():
results = RAG.search_encoded_docs(query="What is late interaction?", k=2)

Encoding with Metadata

RAG.encode(
    documents=["Document one.", "Document two."],
    document_metadatas=[
        {"source": "wiki"},
        {"source": "arxiv"},
    ],
)

Incremental Encoding

# First batch
RAG.encode(["First batch doc 1.", "First batch doc 2."])

# Second batch — appends to existing
RAG.encode(["Second batch doc 1.", "Second batch doc 2."])
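
The append behaviour above can be sketched without the library. The helper below is illustrative, not RAGatouille internals: when a later batch contains longer documents, both the stored and the new embeddings must be brought to a common token length before the tensors can be concatenated.

```python
# Sketch of incremental append semantics: successive encode() calls
# re-pad old and new batches to a shared max length before
# concatenating. Names and the list representation are illustrative.

def append_batch(store, new_docs, pad=0.0):
    """store: previously padded docs; new_docs: raw docs to append."""
    all_docs = [d[:] for d in store] + [d[:] for d in new_docs]
    max_len = max(len(d) for d in all_docs)
    return [d + [pad] * (max_len - len(d)) for d in all_docs]

store = append_batch([], [[1.0, 2.0]])           # first call
store = append_batch(store, [[3.0, 4.0, 5.0]])   # second call, longer doc
print([len(d) for d in store])  # → [3, 3]
```

This is why multiple encode() calls can safely accumulate into a single stored tensor: every call leaves all documents at one uniform padded length.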

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
