Implementation:PacktPublishing LLM Engineers Handbook VectorBaseDocument Search
| Field | Value |
|---|---|
| Type | API Doc |
| Workflow | RAG_Inference |
| Repository | PacktPublishing/LLM-Engineers-Handbook |
| Source | vector.py:L138-161, retriever.py:L63-97 |
| Implements | Principle:PacktPublishing_LLM_Engineers_Handbook_Vector_Similarity_Search |
API Signature
VectorBaseDocument.search(
    cls,
    query_vector: list,
    limit: int = 3,
    query_filter: Filter | None = None,
) -> list[T]
Import
from llm_engineering.domain.base.vector import VectorBaseDocument
Key Code
From vector.py (the base class search method):
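@classmethod
def search(cls, query_vector: list, limit: int = 3, query_filter=None) -> list:
    # Each document subclass maps to its own Qdrant collection.
    collection_name = cls.get_collection_name()
    qdrant_client = connection.get_qdrant_client()
    # Nearest-neighbour search, optionally restricted by a metadata filter.
    hits = qdrant_client.search(
        collection_name=collection_name,
        query_vector=query_vector,
        limit=limit,
        query_filter=query_filter,
    )
    # Convert raw hits into instances of the calling class.
    return [cls.from_record(hit) for hit in hits]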
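The class-method pattern above can be tried without a running Qdrant instance. The following is a minimal sketch, not the repository's code: `FakeHit`, `FakeClient`, `BaseDoc`, `ArticleDoc`, `PostDoc`, and the collection names are all hypothetical stand-ins, and the fake client ignores the query vector and filter entirely. It only illustrates how `search` called on a subclass resolves that subclass's collection and returns instances of that subclass.

```python
from dataclasses import dataclass
from typing import Any, ClassVar, TypeVar

T = TypeVar("T", bound="BaseDoc")


@dataclass
class FakeHit:
    """Hypothetical stand-in for a raw search hit."""
    id: str
    score: float
    payload: dict[str, Any]


class FakeClient:
    """Hypothetical in-memory client that returns canned hits per collection."""

    def __init__(self, data: dict[str, list[FakeHit]]) -> None:
        self._data = data

    def search(self, collection_name, query_vector, limit, query_filter=None):
        # The query vector and filter are ignored: canned hits, best score first.
        hits = sorted(self._data.get(collection_name, []),
                      key=lambda h: h.score, reverse=True)
        return hits[:limit]


# Collection names below are illustrative assumptions.
CLIENT = FakeClient({
    "embedded_articles": [
        FakeHit("a1", 0.91, {"content": "RAG basics"}),
        FakeHit("a2", 0.42, {"content": "Unrelated"}),
    ],
    "embedded_posts": [FakeHit("p1", 0.77, {"content": "A post"})],
})


@dataclass
class BaseDoc:
    id: str
    content: str
    collection: ClassVar[str] = ""  # overridden by each subclass

    @classmethod
    def get_collection_name(cls) -> str:
        return cls.collection

    @classmethod
    def from_record(cls: type[T], hit: FakeHit) -> T:
        return cls(id=hit.id, content=hit.payload["content"])

    @classmethod
    def search(cls: type[T], query_vector: list, limit: int = 3,
               query_filter: Any = None) -> list[T]:
        hits = CLIENT.search(
            collection_name=cls.get_collection_name(),
            query_vector=query_vector,
            limit=limit,
            query_filter=query_filter,
        )
        return [cls.from_record(hit) for hit in hits]


class ArticleDoc(BaseDoc):
    collection = "embedded_articles"


class PostDoc(BaseDoc):
    collection = "embedded_posts"


if __name__ == "__main__":
    print(ArticleDoc.search(query_vector=[0.1, 0.2], limit=1))
```

Because `search` is a class method, the caller picks the collection simply by choosing which subclass to call it on, and the return type follows the subclass.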
From retriever.py (the orchestration logic that calls search across multiple collections):
The retriever embeds each expanded query with the same embedding model used for the stored documents, constructs optional metadata filters from the self-query results, and searches multiple document collections (posts, articles, repositories) in parallel. Results are aggregated and deduplicated before being passed to the reranker.
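The orchestration described above might look roughly like the sketch below. It is an illustration under stated assumptions rather than the contents of retriever.py: `embed`, `make_search`, `COLLECTIONS`, `Chunk`, and `search_all` are hypothetical names, the per-collection search functions return canned results, and deduplication by chunk `id` is an assumed strategy.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    id: str
    content: str


def embed(query: str) -> list[float]:
    """Hypothetical embedder standing in for the real embedding model."""
    return [float(len(query)), float(sum(map(ord, query)) % 7)]


def make_search(results: list[Chunk]):
    """Build a hypothetical stand-in for one document class's search()."""
    def search(query_vector, limit=3, query_filter=None):
        return results[:limit]
    return search


COLLECTIONS = {
    "posts": make_search([Chunk("p1", "post chunk")]),
    "articles": make_search([Chunk("a1", "article chunk"),
                             Chunk("a2", "another article chunk")]),
    # Returns a chunk already seen elsewhere, to exercise deduplication.
    "repositories": make_search([Chunk("a1", "article chunk")]),
}


def search_all(queries: list[str], k: int = 3, query_filter=None) -> list[Chunk]:
    # One job per (expanded query, collection) pair.
    jobs = [(embed(q), search) for q in queries for search in COLLECTIONS.values()]

    def run(job):
        vector, search = job
        return search(vector, limit=k, query_filter=query_filter)

    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(run, jobs))  # map() preserves job order

    # Aggregate and deduplicate by chunk id, keeping the first occurrence.
    seen: set[str] = set()
    unique: list[Chunk] = []
    for batch in batches:
        for chunk in batch:
            if chunk.id not in seen:
                seen.add(chunk.id)
                unique.append(chunk)
    return unique


if __name__ == "__main__":
    for chunk in search_all(["what is RAG?", "define retrieval-augmented generation"]):
        print(chunk.id, chunk.content)
```

Threads suit this workload because each search is an I/O-bound network call; the deduplicated list is what would then be handed to the reranker.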
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| query_vector | list[float] | (required) | The embedding vector for the query text |
| limit | int | 3 | Maximum number of results to return per collection |
| query_filter | Filter or None | None | Optional Qdrant filter for metadata-based pre-filtering |
Inputs and Outputs
Inputs:
- query_vector (list[float]) - Dense vector embedding of the query text
- limit (int) - Maximum number of results to return
- query_filter (Qdrant Filter) - Optional metadata filter (e.g., filtering by author_id)
Outputs:
- list[T] - List of matching documents sorted by cosine similarity score, where T is a subclass of VectorBaseDocument (e.g., EmbeddedChunk)
How It Works
- The class method resolves the collection name from the document type (posts, articles, or repositories)
- A Qdrant client is obtained via connection.get_qdrant_client()
- The search call is made to Qdrant with the query vector, limit, and optional filter
- Qdrant performs ANN search using its HNSW index, applying any metadata filters as pre-filters
- Raw search hits are converted to domain objects via cls.from_record(hit)
- Results are returned sorted by descending similarity score
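To make the filter-then-rank behaviour in the steps above concrete, here is a small exact (brute-force) version in plain Python. It is a conceptual sketch only: Qdrant uses an approximate HNSW index rather than scanning every point, and `POINTS`, `cosine`, and `brute_force_search` are hypothetical names with made-up data. The `author_id` argument stands in for a metadata filter.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Made-up points: id, vector, and a metadata payload.
POINTS = [
    {"id": "c1", "vector": [1.0, 0.0], "payload": {"author_id": "u1"}},
    {"id": "c2", "vector": [0.8, 0.6], "payload": {"author_id": "u2"}},
    {"id": "c3", "vector": [0.0, 1.0], "payload": {"author_id": "u1"}},
    {"id": "c4", "vector": [0.6, 0.8], "payload": {"author_id": "u1"}},
]


def brute_force_search(query_vector, limit=3, author_id=None):
    # 1. Pre-filter: keep only points whose metadata matches.
    candidates = [
        p for p in POINTS
        if author_id is None or p["payload"]["author_id"] == author_id
    ]
    # 2. Score every remaining point against the query.
    scored = [(cosine(query_vector, p["vector"]), p["id"]) for p in candidates]
    # 3. Sort by descending similarity and keep the top `limit`.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    print(brute_force_search([1.0, 0.0]))
    print(brute_force_search([1.0, 0.0], limit=2, author_id="u1"))
```

With the filter applied first, `limit` counts only matching points, so a restrictive filter does not shrink the result set below `limit` as long as enough matching points exist.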
External Dependencies
- qdrant_client - Python client for the Qdrant vector database
Source Files
- llm_engineering/domain/base/vector.py (lines 138-161)
- llm_engineering/application/rag/retriever.py (lines 63-97)