Implementation:CrewAIInc CrewAI MongoDB Vector Search Tool
| Knowledge Sources | |
|---|---|
| Domains | Vector Search, Database Integration, RAG |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
MongoDBVectorSearchTool is a CrewAI tool that performs semantic vector search against MongoDB Atlas collections using OpenAI embeddings, enabling retrieval-augmented generation (RAG) workflows.
Description
The tool extends BaseTool with full MongoDB Atlas vector search integration. On initialization, it creates an OpenAI or AzureOpenAI client for generating embeddings and establishes a MongoDB connection with pymongo using a connection string and CrewAI driver info metadata.
The core search mechanism works as follows:
- The _run method embeds the query text using the configured OpenAI embedding model.
- It constructs a MongoDB aggregation pipeline with a $vectorSearch stage using the embedded query vector, configured index name, number of candidates (limit multiplied by oversampling_factor), and optional pre-filters.
- Each result carries a vectorSearchScore relevance score; embedding vectors can be left out of the output (see include_embeddings) to keep responses small.
- Post-filter pipelines can be applied for additional result processing.
The tool also supports document ingestion via add_texts, which batches text insertion with embeddings (max 100 documents or 47MB per batch) using ReplaceOne upsert operations with ObjectId-based IDs. The create_vector_search_index convenience method wraps utility functions for creating Atlas Search vector indexes.
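The batching rule can be illustrated with a short generator. This is a sketch under stated assumptions rather than the tool's implementation: the helper name `iter_batches` is hypothetical, batch size is approximated by the UTF-8 length of each text, and 47MB is read as 47 × 1024 × 1024 bytes.

```python
from typing import Iterable, Iterator

MAX_BATCH_DOCS = 100
MAX_BATCH_BYTES = 47 * 1024 * 1024  # assumption: "47MB" interpreted as MiB


def iter_batches(
    texts: Iterable[str],
    max_docs: int = MAX_BATCH_DOCS,
    max_bytes: int = MAX_BATCH_BYTES,
) -> Iterator[list[str]]:
    """Yield lists of texts that stay within both the count and size limits."""
    batch: list[str] = []
    batch_bytes = 0
    for text in texts:
        size = len(text.encode("utf-8"))
        # Flush before adding a text that would break either limit.
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(text)
        batch_bytes += size
    if batch:
        yield batch
```

Each yielded batch would then be embedded and written in a single bulk operation, for example one `ReplaceOne(..., upsert=True)` per document.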
The MongoDBVectorSearchConfig model controls query behavior including limit, pre_filter, post_filter_pipeline, oversampling_factor, and include_embeddings.
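As a rough illustration of what these settings might look like, the snippet below writes them as plain Python values. The metadata fields `source` and `year` are hypothetical, and the values are examples rather than the model's defaults. Note that Atlas only accepts pre-filters on fields that the vector index declares as filter fields.

```python
# Example query settings expressed as plain values (not the model's defaults).
query_settings = {
    "limit": 5,                 # documents returned by $vectorSearch
    "oversampling_factor": 10,  # numCandidates = limit * oversampling_factor
    "include_embeddings": False,
    # Applied inside $vectorSearch, before scoring; "source" is hypothetical.
    "pre_filter": {"source": {"$in": ["pdf", "web"]}},
    # Ordinary aggregation stages run after the search; "year" is hypothetical.
    "post_filter_pipeline": [
        {"$match": {"year": {"$gte": 2024}}},
        {"$sort": {"year": -1}},
    ],
}

num_candidates = query_settings["limit"] * query_settings["oversampling_factor"]
```

These keys mirror the field names listed above, so the same values could be supplied as keyword arguments to MongoDBVectorSearchConfig.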
Usage
Use this tool when agents need semantic document retrieval from MongoDB Atlas collections. It is ideal for RAG applications, knowledge base search, and any workflow requiring similarity-based document lookup with MongoDB as the vector store.
Code Reference
Source Location
- Repository: CrewAI
- File: lib/crewai-tools/src/crewai_tools/tools/mongodb_vector_search_tool/vector_search.py
- Lines: 1-331
Signature
class MongoDBVectorSearchTool(BaseTool):
    name: str = "MongoDBVectorSearchTool"
    description: str = "A tool to perfrom a vector search on a MongoDB database for relevant information on internal documents."
    args_schema: type[BaseModel] = MongoDBToolSchema
    query_config: MongoDBVectorSearchConfig | None = None
    embedding_model: str = "text-embedding-3-large"
    vector_index_name: str = "vector_index"
    text_key: str = "text"
    embedding_key: str = "embedding"
    database_name: str = Field(...)
    collection_name: str = Field(...)
    connection_string: str = Field(...)
    dimensions: int = 1536
Import
from crewai_tools.tools.mongodb_vector_search_tool.vector_search import MongoDBVectorSearchTool
I/O Contract
Inputs (Runtime)
| Name | Type | Required | Description |
|---|---|---|---|
| query | str | Yes | The search query text to find semantically relevant documents |
Constructor Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| database_name | str | Yes | The name of the MongoDB database |
| collection_name | str | Yes | The name of the MongoDB collection |
| connection_string | str | Yes | The connection string of the MongoDB cluster |
| embedding_model | str | No | OpenAI embedding model (default: "text-embedding-3-large") |
| vector_index_name | str | No | Name of the Atlas Search vector index (default: "vector_index") |
| text_key | str | No | MongoDB field containing text for each document (default: "text") |
| embedding_key | str | No | Field containing the embedding vector (default: "embedding") |
| dimensions | int | No | Number of dimensions in the embedding vector (default: 1536) |
| query_config | MongoDBVectorSearchConfig | No | Configuration for search queries (limit, pre_filter, post_filter_pipeline, oversampling_factor, include_embeddings) |
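The vector_index_name, embedding_key, and dimensions parameters have to agree with the Atlas index that serves the search. A minimal sketch of a matching index definition follows; the helper name `vector_index_definition` is hypothetical, and the `cosine` similarity and the `source` filter field are assumptions. The definition layout (`fields` entries of type `vector` and `filter`) follows the Atlas Vector Search index format.

```python
from typing import Any, Sequence


def vector_index_definition(
    embedding_key: str = "embedding",
    dimensions: int = 1536,
    similarity: str = "cosine",  # Atlas also accepts "euclidean" and "dotProduct"
    filter_fields: Sequence[str] = (),
) -> dict[str, Any]:
    """Build the definition document for an Atlas vector search index."""
    fields: list[dict[str, Any]] = [
        {
            "type": "vector",
            "path": embedding_key,
            "numDimensions": dimensions,
            "similarity": similarity,
        }
    ]
    # Fields used in pre-filters must be declared as filter fields.
    fields.extend({"type": "filter", "path": path} for path in filter_fields)
    return {"fields": fields}


definition = vector_index_definition(filter_fields=["source"])
```

If the index and the tool disagree on the embedding path or the number of dimensions, queries will not match the stored vectors, so it is worth keeping both derived from the same values.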
Outputs
| Name | Type | Description |
|---|---|---|
| return | str | JSON-encoded list of matching documents with scores, or empty string on error |
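Because the return value is either a JSON string or an empty string, callers may want a small guard before decoding. The helper below is a sketch; the name `parse_results` and the sample document fields are hypothetical.

```python
import json
from typing import Any


def parse_results(raw: str) -> list[dict[str, Any]]:
    """Decode the tool's output, treating the empty error string as no results."""
    if not raw:
        return []
    return json.loads(raw)


# Sample payload with hypothetical fields, standing in for real tool output.
sample = '[{"text": "Document one text", "score": 0.91}]'
docs = parse_results(sample)
```

Returning an empty list on error keeps downstream loops simple, at the cost of hiding the difference between "no matches" and "search failed".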
Usage Examples
Basic Usage
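import os

# Set the OpenAI key before the tool is constructed; the embedding client is created on initialization.
os.environ["OPENAI_API_KEY"] = "your-openai-key"

from crewai_tools.tools.mongodb_vector_search_tool.vector_search import (
    MongoDBVectorSearchTool,
    MongoDBVectorSearchConfig,
)

# Placeholder credentials: replace with your own Atlas connection string.
tool = MongoDBVectorSearchTool(
    database_name="my_database",
    collection_name="documents",
    connection_string="mongodb+srv://user:pass@cluster.mongodb.net/",
    query_config=MongoDBVectorSearchConfig(
        limit=5,                   # return the 5 best matches
        oversampling_factor=10,    # consider 5 * 10 = 50 candidates
        include_embeddings=False,  # omit embedding vectors from results
    ),
)

# Search for relevant documents (returns a JSON-encoded string)
results = tool._run(query="machine learning best practices")

# Add documents to the collection; metadatas pairs one dict with each text
tool.add_texts(
    texts=["Document one text", "Document two text"],
    metadatas=[{"source": "web"}, {"source": "pdf"}],
)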