Implementation:PacktPublishing LLM Engineers Handbook VectorBaseDocument Search

From Leeroopedia


Field       Value
Type        API Doc
Workflow    RAG_Inference
Repository  PacktPublishing/LLM-Engineers-Handbook
Source      vector.py:L138-161, retriever.py:L63-97
Implements  Principle:PacktPublishing_LLM_Engineers_Handbook_Vector_Similarity_Search

API Signature

VectorBaseDocument.search(
    cls,
    query_vector: list,
    limit: int = 3,
    query_filter: Filter | None = None
) -> list[T]

Import

from llm_engineering.domain.base.vector import VectorBaseDocument

Key Code

From vector.py (the base class search method):

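@classmethod
def search(cls, query_vector: list, limit: int = 3, query_filter=None) -> list:
    # Each document subclass maps to its own Qdrant collection.
    collection_name = cls.get_collection_name()
    qdrant_client = connection.get_qdrant_client()
    # Nearest-neighbor search, optionally restricted by a metadata filter.
    hits = qdrant_client.search(
        collection_name=collection_name,
        query_vector=query_vector,
        limit=limit,
        query_filter=query_filter,
    )
    # Convert the raw Qdrant hits into domain objects of the calling class.
    return [cls.from_record(hit) for hit in hits]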

From retriever.py (the orchestration logic that calls search across multiple collections), summarized in prose:

The retriever embeds each expanded query with the same embedding model, builds an optional metadata filter from the self-query results, and calls search on several document collections (posts, articles, repositories) in parallel. It then aggregates and deduplicates the hits before passing them to the reranker.
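The fan-out described above can be sketched as follows. This is a minimal illustration, not the code in retriever.py: Hit, FAKE_INDEX, search_collection, and retrieve are hypothetical names, the in-memory index stands in for the Qdrant collections, and the dot product stands in for the real similarity score. It assumes one search task per (expanded query, collection) pair and deduplication by document id.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical stand-in for a retrieved chunk."""
    id: str
    collection: str
    score: float


# Hypothetical in-memory data replacing the real Qdrant collections.
FAKE_INDEX = {
    "posts": [("p1", [1.0, 0.0]), ("p2", [0.0, 1.0])],
    "articles": [("a1", [0.8, 0.6])],
    "repositories": [("r1", [0.6, 0.8])],
}


def search_collection(collection, query_vector, limit):
    """Placeholder for a per-collection search call; scores by dot product."""
    scored = [
        Hit(doc_id, collection, sum(q * d for q, d in zip(query_vector, vec)))
        for doc_id, vec in FAKE_INDEX[collection]
    ]
    return sorted(scored, key=lambda h: h.score, reverse=True)[:limit]


def retrieve(query_vectors, limit=3):
    """Search every collection for every expanded query, then deduplicate."""
    tasks = [(name, qv) for qv in query_vectors for name in FAKE_INDEX]
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(search_collection, name, qv, limit) for name, qv in tasks]
        hits = [hit for future in futures for hit in future.result()]

    # The same document can match several expanded queries; keep its best score.
    best = {}
    for hit in hits:
        if hit.id not in best or hit.score > best[hit.id].score:
            best[hit.id] = hit
    return sorted(best.values(), key=lambda h: h.score, reverse=True)


if __name__ == "__main__":
    for hit in retrieve([[1.0, 0.0], [0.0, 1.0]], limit=1):
        print(hit.id, hit.collection, round(hit.score, 2))
```

Threads suit this workload because each task mostly waits on a database call; the deduplicated list is what a reranker would receive next.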

Parameters

Parameter     Type            Default     Description
query_vector  list[float]     (required)  The embedding vector for the query text
limit         int             3           Maximum number of results to return per collection
query_filter  Filter or None  None        Optional Qdrant filter for metadata-based pre-filtering

Inputs and Outputs

Inputs:

  • query_vector (list[float]) - Dense vector embedding of the query text
  • limit (int) - Maximum number of results to return
  • query_filter (Qdrant Filter) - Optional metadata filter (e.g., filtering by author_id)

Outputs:

  • list[T] - List of matching documents sorted by descending cosine similarity score, where T is the subclass of VectorBaseDocument on which search was called (e.g., EmbeddedChunk)

How It Works

  1. The class method resolves the collection name from the document type (posts, articles, or repositories)
  2. A Qdrant client is obtained via connection.get_qdrant_client()
  3. The search call is made to Qdrant with the query vector, limit, and optional filter
  4. Qdrant performs approximate nearest neighbor (ANN) search using its HNSW index, applying any metadata filter during the search rather than to the final results
  5. Raw search hits are converted to domain objects via cls.from_record(hit)
  6. Results are returned sorted by descending similarity score
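To make steps 3 to 6 concrete, here is a minimal sketch of a filtered similarity search over an in-memory list. It deliberately swaps Qdrant's HNSW-based approximate search for an exact brute-force scan, and it uses a plain dict of required payload values in place of a Qdrant Filter. Record, cosine_similarity, and this standalone search function are hypothetical names used only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stored point: an id, a vector, and metadata payload."""
    id: str
    vector: list
    payload: dict = field(default_factory=dict)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(records, query_vector, limit=3, payload_filter=None):
    """Exact (brute-force) search; returns (score, record) pairs, best first."""
    # Apply the metadata filter first so only matching records are scored.
    candidates = [
        r for r in records
        if payload_filter is None
        or all(r.payload.get(key) == value for key, value in payload_filter.items())
    ]
    scored = [(cosine_similarity(query_vector, r.vector), r) for r in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    records = [
        Record("c1", [1.0, 0.0], {"author_id": "a"}),
        Record("c2", [0.0, 1.0], {"author_id": "a"}),
        Record("c3", [1.0, 1.0], {"author_id": "b"}),
    ]
    for score, record in search(records, [1.0, 0.0], limit=2, payload_filter={"author_id": "a"}):
        print(record.id, round(score, 3))
```

A brute-force scan costs time linear in the collection size, which is the cost an ANN index such as HNSW avoids at the price of approximate results.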

External Dependencies

  • qdrant_client - Python client for the Qdrant vector database

Source Files

  • llm_engineering/domain/base/vector.py (lines 138-161)
  • llm_engineering/application/rag/retriever.py (lines 63-97)
