Principle:AnswerDotAI RAGatouille Semantic Search

From Leeroopedia
Knowledge Sources
Domains: NLP, Information_Retrieval, Search
Last Updated: 2026-02-12 12:00 GMT

Overview

A retrieval mechanism that finds the most relevant passages in a pre-built PLAID index by encoding a query into token-level embeddings and computing late-interaction MaxSim scores against indexed document representations.

Description

Semantic Search in the ColBERT framework operates on a pre-built PLAID index. Given a query string (or batch of queries), the system encodes the query into token-level embeddings, then uses the PLAID engine to efficiently retrieve the top-k most relevant passages. The PLAID search algorithm uses centroid interaction to prune the candidate set before performing full late-interaction scoring on the remaining candidates.

The search pipeline involves:

  • Query encoding into token-level embeddings via the ColBERT checkpoint
  • Centroid-based candidate generation using the inverted index
  • Decompression of candidate document residuals
  • Full MaxSim scoring between query and candidate token embeddings
  • Result formatting with content, scores, ranks, document IDs, and optional metadata
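The last step of the pipeline can be illustrated with a minimal sketch. The helper name `format_results`, the dictionary keys, and the choice to start ranks at 1 are assumptions made for illustration; they are not taken from the library's source.

```python
from typing import Any, Dict, List, Optional, Tuple

# (document_id, passage text, relevance score); a hypothetical input shape.
ScoredPassage = Tuple[str, str, float]


def format_results(
    scored: List[ScoredPassage],
    metadata: Optional[Dict[str, Dict[str, Any]]] = None,
) -> List[Dict[str, Any]]:
    """Sort passages by descending score and attach rank and optional metadata."""
    ranked = sorted(scored, key=lambda item: item[2], reverse=True)
    results: List[Dict[str, Any]] = []
    for rank, (doc_id, content, score) in enumerate(ranked, start=1):
        entry: Dict[str, Any] = {
            "content": content,
            "score": score,
            "rank": rank,  # assumption: ranks are 1-based
            "document_id": doc_id,
        }
        # Metadata is optional: only attach it when it exists for this document.
        if metadata and doc_id in metadata:
            entry["document_metadata"] = metadata[doc_id]
        results.append(entry)
    return results


formatted = format_results(
    [("d1", "first passage", 12.5), ("d2", "second passage", 17.0)],
    metadata={"d2": {"source": "example"}},
)
```

Here the higher-scoring passage from `d2` is ranked first and carries its metadata, while `d1` has no metadata key at all.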

Usage

Use this principle after building or loading an index. This is the primary online retrieval mechanism for:

  • Answering user queries against an indexed document collection
  • Providing context for RAG (Retrieval-Augmented Generation) pipelines
  • Batch search across multiple queries simultaneously
  • Filtered search restricted to specific document IDs

Theoretical Basis

ColBERT search computes relevance via the MaxSim operator:

$$S(q, d) = \sum_{i=1}^{|q|} \max_{j=1}^{|d|} E_{q_i} \cdot E_{d_j}^{T}$$
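As a small worked illustration of the MaxSim operator, the sketch below scores a two-token query against two toy documents using plain Python lists. The vectors and the function names `dot` and `maxsim` are made up for the example; real ColBERT embeddings are high-dimensional and computed in batches on tensors.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim(query_embs: Sequence[Vector], doc_embs: Sequence[Vector]) -> float:
    """For each query token take its best-matching document token, then sum."""
    return sum(max(dot(q, d) for d in doc_embs) for q in query_embs)


# Toy 2-dimensional token embeddings (illustrative values only).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.6, 0.8]]    # has a good match for both query tokens
doc_b = [[0.0, -1.0], [-1.0, 0.0]]  # matches neither query token

score_a = maxsim(query, doc_a)  # 1.0 + 0.8
score_b = maxsim(query, doc_b)  # 0.0 + 0.0
```

Because each query token independently picks its best document token, `doc_a` scores higher than `doc_b` even though no single token of `doc_a` matches the whole query.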

PLAID accelerates this by:

1. Centroid Interaction: Compute approximate scores using only centroid representations to prune the candidate set.

2. Candidate Refinement: For top candidates, decompress the quantized residuals and compute exact MaxSim scores.

3. Configurable Precision: Parameters such as ncells (number of centroids to probe) and ndocs (candidate pool size) trade speed against recall.
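The three ideas above can be sketched as a toy two-stage search. This is a simplified illustration under stated assumptions, not the PLAID implementation: residuals are stored as raw floats rather than quantized codes, the index is a small in-memory dictionary, and `toy_search`, `CENTROIDS`, and `INDEX` are hypothetical names.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def maxsim(query: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    return sum(max(dot(q, d) for d in doc_tokens) for q in query)


# Toy index: every document token is stored as (centroid_id, residual).
CENTROIDS: List[List[float]] = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
INDEX: Dict[str, List[Tuple[int, List[float]]]] = {
    "doc_a": [(0, [0.0, 0.0]), (1, [0.1, -0.1])],
    "doc_b": [(2, [0.0, 0.0]), (2, [0.1, 0.0])],
    "doc_c": [(0, [-0.2, 0.1]), (0, [0.0, 0.0])],
}


def toy_search(
    query: Sequence[Vector], ncells: int, ndocs: int, k: int
) -> List[Tuple[str, float]]:
    # Candidate generation: each query token probes its `ncells` closest centroids.
    probed = set()
    for q in query:
        ranked = sorted(
            range(len(CENTROIDS)), key=lambda c: dot(q, CENTROIDS[c]), reverse=True
        )
        probed.update(ranked[:ncells])
    candidates = [
        doc for doc, tokens in INDEX.items() if any(c in probed for c, _ in tokens)
    ]

    # Centroid interaction: approximate MaxSim using centroids in place of tokens,
    # then keep only the `ndocs` best candidates.
    approx = {
        doc: maxsim(query, [CENTROIDS[c] for c, _ in INDEX[doc]])
        for doc in candidates
    }
    shortlist = sorted(candidates, key=lambda d: approx[d], reverse=True)[:ndocs]

    # Candidate refinement: rebuild tokens as centroid + residual, score exactly.
    exact = {}
    for doc in shortlist:
        tokens = [
            [c_i + r_i for c_i, r_i in zip(CENTROIDS[c], residual)]
            for c, residual in INDEX[doc]
        ]
        exact[doc] = maxsim(query, tokens)
    return sorted(exact.items(), key=lambda kv: kv[1], reverse=True)[:k]


results = toy_search([[1.0, 0.0], [0.0, 1.0]], ncells=1, ndocs=2, k=2)
```

With `ncells=1`, `doc_b` is never scored because none of its tokens belong to a probed centroid. Raising `ncells` or `ndocs` widens the candidate pool, which costs more scoring work but lowers the chance of missing a relevant document.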

The searcher dynamically adapts these parameters based on collection size:

  • <10k documents: ncells=8, centroid_score_threshold=0.4
  • 10k-100k documents: ncells=4, centroid_score_threshold=0.45
  • >100k documents: default settings
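The size-based selection above could be expressed as a small helper. The function name `pick_search_params` is hypothetical, and the handling of the exact boundaries (10,000 and 100,000 documents) is an assumption, since the ranges listed do not say which bucket they fall in. An empty dictionary stands for "no overrides, use the default settings".

```python
from typing import Dict, Union

Overrides = Dict[str, Union[int, float]]


def pick_search_params(num_documents: int) -> Overrides:
    """Return search-parameter overrides for a collection of the given size."""
    if num_documents < 10_000:
        # Small collections: probe more centroids, lower pruning threshold.
        return {"ncells": 8, "centroid_score_threshold": 0.4}
    if num_documents <= 100_000:  # assumption: the upper bound is inclusive
        return {"ncells": 4, "centroid_score_threshold": 0.45}
    # Large collections: leave the default settings untouched.
    return {}


small = pick_search_params(5_000)
medium = pick_search_params(50_000)
large = pick_search_params(2_000_000)
```

Smaller collections get a larger `ncells` and a lower threshold, which prunes less aggressively where the cost of scoring extra candidates is low.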

Related Pages

Implemented By

Uses Heuristic