Implementation:AnswerDotAI RAGatouille RAGPretrainedModel Encode
| Knowledge Sources | |
|---|---|
| Domains | NLP, Information_Retrieval, Encoding |
| Last Updated | 2026-02-12 12:00 GMT |
Overview
Concrete tool, provided by the RAGatouille library, for encoding documents into in-memory token-level embedding tensors for index-free search.
Description
The RAGPretrainedModel.encode() method encodes documents into dense token-level embedding tensors held in memory. It delegates to ColBERT.encode(), which calls ColBERT._encode_index_free_documents() to produce embeddings via the inference checkpoint, then pads them to a uniform length and stores them alongside attention masks. It supports incremental encoding (multiple calls append to the existing tensors) and optional per-document metadata.
The delegation chain:
- RAGPretrainedModel.encode() → validates and delegates
- ColBERT.encode() → manages max_tokens, pads embeddings, stores/appends tensors and metadata
- ColBERT._encode_index_free_documents() → runs inference checkpoint, returns raw embeddings and masks
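The padding-and-masking step in the middle of this chain can be sketched as follows. This is a minimal illustration using plain Python lists rather than the library's torch tensors; pad_embeddings is a hypothetical helper, not part of the RAGatouille API:

```python
def pad_embeddings(doc_embeddings, dim=4):
    """Pad per-document token embeddings to one uniform length.

    doc_embeddings: list of documents, each a list of token vectors.
    Returns (padded, masks), where each mask marks real tokens (1)
    versus padding positions (0). Sketch only; RAGatouille performs
    the equivalent operation on torch tensors.
    """
    max_len = max(len(doc) for doc in doc_embeddings)
    padded, masks = [], []
    for doc in doc_embeddings:
        pad_count = max_len - len(doc)
        padded.append(doc + [[0.0] * dim] * pad_count)  # zero-vector padding
        masks.append([1] * len(doc) + [0] * pad_count)  # attention mask
    return padded, masks

docs = [
    [[0.1] * 4, [0.2] * 4],             # 2 token vectors
    [[0.3] * 4, [0.4] * 4, [0.5] * 4],  # 3 token vectors
]
padded, masks = pad_embeddings(docs)
```

Storing the masks alongside the padded tensors lets later scoring ignore padding positions without re-tokenizing the documents.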
Usage
Call after loading a pretrained model and before calling search_encoded_docs(). Suitable for small collections only; search performance degrades rapidly as the number of documents grows.
Code Reference
Source Location
- Repository: RAGatouille
- File: ragatouille/RAGPretrainedModel.py
- Lines: L359-384
Signature
def encode(
    self,
    documents: list[str],
    bsize: Union[Literal["auto"], int] = "auto",
    document_metadatas: Optional[list[dict]] = None,
    verbose: bool = True,
    max_document_length: Union[Literal["auto"], int] = "auto",
) -> None:
    """Encode documents in memory for index-free search.

    Parameters:
        documents: The documents to encode.
        bsize: Batch size ("auto" = 32, adjusted for long docs).
        document_metadatas: Optional metadata dicts per document.
        verbose: Print progress (default True).
        max_document_length: Max token length ("auto" uses 90th percentile).
    """
Import
from ragatouille import RAGPretrainedModel
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| documents | list[str] | Yes | Documents to encode in memory |
| bsize | Union[Literal["auto"], int] | No | Batch size. "auto" (default) = 32, adjusted downward for long documents |
| document_metadatas | Optional[list[dict]] | No | Metadata dicts, one per document |
| verbose | bool | No | Print encoding progress (default True) |
| max_document_length | Union[Literal["auto"], int] | No | Max token length. "auto" calculates from 90th percentile |
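The "auto" rule for max_document_length described above (90th percentile of document token lengths) can be approximated as follows. auto_max_document_length is a hypothetical helper, and RAGatouille's exact percentile method and rounding may differ:

```python
import math

def auto_max_document_length(token_counts):
    """Approximate the "auto" rule: take the 90th percentile of
    per-document token counts as the maximum document length.
    Uses the nearest-rank method; the library's exact choice of
    percentile implementation may differ."""
    ranked = sorted(token_counts)
    idx = math.ceil(0.9 * len(ranked)) - 1  # nearest-rank 90th percentile
    return ranked[idx]

# One long outlier (1000 tokens) is excluded from the chosen max,
# so most documents are kept whole without padding everything to it.
lengths = [50, 60, 80, 100, 120, 150, 180, 200, 400, 1000]
max_len = auto_max_document_length(lengths)
```

Capping at a percentile rather than the true maximum keeps a single very long document from inflating the padded tensor for the whole collection.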
Outputs
| Name | Type | Description |
|---|---|---|
| return | None | Side effects: populates self.model.in_memory_collection, self.model.in_memory_embed_docs (tensor), self.model.doc_masks (tensor), and self.model.in_memory_metadata |
Usage Examples
Basic Document Encoding
from ragatouille import RAGPretrainedModel
RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")
documents = [
    "ColBERT uses late interaction for retrieval.",
    "BERT is a transformer-based language model.",
    "RAGatouille simplifies ColBERT usage.",
]
RAG.encode(documents)
# Documents are now in memory, ready for search
Encoding with Metadata
RAG.encode(
    documents=["Document one.", "Document two."],
    document_metadatas=[
        {"source": "wiki"},
        {"source": "arxiv"},
    ],
)
Incremental Encoding
# First batch
RAG.encode(["First batch doc 1.", "First batch doc 2."])
# Second batch — appends to existing
RAG.encode(["Second batch doc 1.", "Second batch doc 2."])
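The append semantics of repeated encode() calls can be sketched with a toy store. This is a simplified stand-in using Python lists, not RAGatouille's actual tensor storage or API; the point is that each call extends the existing state and everything is re-padded to one shared length:

```python
class InMemoryStore:
    """Toy sketch of incremental encoding: every encode() call appends
    to the stored collection and re-pads all masks to a common length,
    mirroring how appended tensors must share one uniform shape."""

    def __init__(self):
        self.collection = []  # raw documents
        self.doc_masks = []   # 1 = real token, 0 = padding

    def encode(self, docs):
        # "Encode" each doc as a mask over its whitespace tokens.
        new_masks = [[1] * len(d.split()) for d in docs]
        self.collection.extend(docs)
        self.doc_masks.extend(new_masks)
        # Re-pad every stored mask to the new global maximum length.
        max_len = max(len(m) for m in self.doc_masks)
        self.doc_masks = [m + [0] * (max_len - len(m)) for m in self.doc_masks]

store = InMemoryStore()
store.encode(["one two", "one two three"])        # first batch
store.encode(["one two three four"])              # appends to existing
```

Note the cost implication: a later batch with longer documents forces re-padding of everything already stored, which is one reason this path is meant for small collections.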
Related Pages
Implements Principle
Requires Environment
- Environment:AnswerDotAI_RAGatouille_Python_ColBERT_Dependencies
- Environment:AnswerDotAI_RAGatouille_GPU_CUDA_Runtime