Implementation:CrewAIInc CrewAI MongoDB Vector Search Tool

Knowledge Sources	CrewAI
Domains	Vector Search, Database Integration, RAG
Last Updated	2026-02-11 00:00 GMT

Overview

MongoDBVectorSearchTool is a CrewAI tool that performs semantic vector search against MongoDB Atlas collections using OpenAI embeddings, enabling retrieval-augmented generation (RAG) workflows.

Description

The tool extends BaseTool with full MongoDB Atlas vector search integration. On initialization, it creates an OpenAI or AzureOpenAI client for generating embeddings and establishes a MongoDB connection with pymongo using a connection string and CrewAI driver info metadata.

The core search mechanism works as follows:

1. The _run method embeds the query text using the configured OpenAI embedding model.
2. It constructs a MongoDB aggregation pipeline with a $vectorSearch stage using the embedded query vector, configured index name, number of candidates (limit multiplied by oversampling_factor), and optional pre-filters.
3. Results include a vectorSearchScore field and optionally exclude embedding vectors for efficiency.
4. Post-filter pipelines can be applied for additional result processing.
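
The pipeline construction described above can be sketched in plain Python. This is a minimal illustration, not the tool's actual code: the function name build_vector_search_pipeline, its default values, and the output field name "score" are assumptions made for the example. The $vectorSearch stage fields (index, path, queryVector, numCandidates, limit, filter) and the vectorSearchScore metadata key follow the MongoDB Atlas aggregation syntax.

```python
from __future__ import annotations

from typing import Any


def build_vector_search_pipeline(
    query_vector: list[float],
    *,
    index_name: str = "vector_index",
    embedding_key: str = "embedding",
    limit: int = 4,                      # assumed default, for illustration only
    oversampling_factor: int = 10,       # assumed default, for illustration only
    pre_filter: dict[str, Any] | None = None,
    post_filter_pipeline: list[dict[str, Any]] | None = None,
    include_embeddings: bool = False,
) -> list[dict[str, Any]]:
    """Hypothetical helper: assemble an Atlas $vectorSearch aggregation pipeline."""
    search: dict[str, Any] = {
        "index": index_name,
        "path": embedding_key,
        "queryVector": query_vector,
        # Candidates considered by the approximate search: limit * oversampling_factor.
        "numCandidates": limit * oversampling_factor,
        "limit": limit,
    }
    if pre_filter:
        search["filter"] = pre_filter

    pipeline: list[dict[str, Any]] = [
        {"$vectorSearch": search},
        # Expose the similarity score; the field name "score" is an assumption.
        {"$set": {"score": {"$meta": "vectorSearchScore"}}},
    ]
    if not include_embeddings:
        # Drop the (large) embedding vector from each returned document.
        pipeline.append({"$project": {embedding_key: 0}})
    if post_filter_pipeline:
        pipeline.extend(post_filter_pipeline)
    return pipeline
```

With limit=5 and oversampling_factor=10, the sketch asks Atlas to consider 50 candidates and return the top 5. A larger oversampling factor generally trades query latency for recall.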

The tool also supports document ingestion via add_texts, which batches text insertion with embeddings (max 100 documents or 47MB per batch) using ReplaceOne upsert operations with ObjectId-based IDs. The create_vector_search_index convenience method wraps utility functions for creating Atlas Search vector indexes.
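
The batching rule in add_texts (flush after 100 documents or roughly 47MB, whichever comes first) can be illustrated with a small generator. This is a sketch under stated assumptions: iter_batches and its constants are hypothetical names, and payload size is approximated by character counts rather than the encoded byte size. The embedding and ReplaceOne upsert steps are left out.

```python
from __future__ import annotations

from collections.abc import Iterable, Iterator
from typing import Any

MAX_BATCH_DOCS = 100          # document cap per batch, as described above
MAX_BATCH_CHARS = 47_000_000  # assumed character-count stand-in for 47MB


def iter_batches(
    texts: Iterable[str],
    metadatas: Iterable[dict[str, Any]],
    max_docs: int = MAX_BATCH_DOCS,
    max_chars: int = MAX_BATCH_CHARS,
) -> Iterator[tuple[list[str], list[dict[str, Any]]]]:
    """Hypothetical helper: yield (texts, metadatas) batches under both caps."""
    texts_batch: list[str] = []
    metas_batch: list[dict[str, Any]] = []
    size = 0
    for text, meta in zip(texts, metadatas):
        texts_batch.append(text)
        metas_batch.append(meta)
        size += len(text) + len(str(meta))  # rough size estimate, not exact bytes
        if len(texts_batch) >= max_docs or size >= max_chars:
            yield texts_batch, metas_batch
            texts_batch, metas_batch, size = [], [], 0
    if texts_batch:
        # Flush the final partial batch.
        yield texts_batch, metas_batch
```

Each yielded batch would then be embedded and written in a single bulk operation, which keeps individual requests below the size limits of both the embedding API and MongoDB.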

The MongoDBVectorSearchConfig model controls query behavior including limit, pre_filter, post_filter_pipeline, oversampling_factor, and include_embeddings.

Usage

Use this tool when agents need semantic document retrieval from MongoDB Atlas collections. It is ideal for RAG applications, knowledge base search, and any workflow requiring similarity-based document lookup with MongoDB as the vector store.

Code Reference

Source Location

Repository: CrewAI
File: lib/crewai-tools/src/crewai_tools/tools/mongodb_vector_search_tool/vector_search.py
Lines: 1-331

Signature

class MongoDBVectorSearchTool(BaseTool):
    name: str = "MongoDBVectorSearchTool"
    description: str = "A tool to perform a vector search on a MongoDB database for relevant information on internal documents."
    args_schema: type[BaseModel] = MongoDBToolSchema
    query_config: MongoDBVectorSearchConfig | None = None
    embedding_model: str = "text-embedding-3-large"
    vector_index_name: str = "vector_index"
    text_key: str = "text"
    embedding_key: str = "embedding"
    database_name: str = Field(...)
    collection_name: str = Field(...)
    connection_string: str = Field(...)
    dimensions: int = 1536

Import

from crewai_tools.tools.mongodb_vector_search_tool.vector_search import MongoDBVectorSearchTool

I/O Contract

Inputs (Runtime)

Name	Type	Required	Description
query	str	Yes	The search query text to find semantically relevant documents

Constructor Parameters

Name	Type	Required	Description
database_name	str	Yes	The name of the MongoDB database
collection_name	str	Yes	The name of the MongoDB collection
connection_string	str	Yes	The connection string of the MongoDB cluster
embedding_model	str	No	OpenAI embedding model (default: "text-embedding-3-large")
vector_index_name	str	No	Name of the Atlas Search vector index (default: "vector_index")
text_key	str	No	MongoDB field containing text for each document (default: "text")
embedding_key	str	No	Field containing the embedding vector (default: "embedding")
dimensions	int	No	Number of dimensions in the embedding vector (default: 1536)
query_config	MongoDBVectorSearchConfig	No	Configuration for search queries (limit, pre_filter, post_filter_pipeline, oversampling_factor, include_embeddings)
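
The embedding_key, dimensions, and vector_index_name parameters must agree with the Atlas vector index defined on the collection. As an illustration, the following sketch builds an index definition document in the Atlas Vector Search format. The helper vector_index_definition and its filter_paths parameter are hypothetical, and this is not necessarily the definition that create_vector_search_index generates.

```python
from __future__ import annotations

from typing import Any


def vector_index_definition(
    embedding_key: str = "embedding",
    dimensions: int = 1536,
    similarity: str = "cosine",
    filter_paths: list[str] | None = None,
) -> dict[str, Any]:
    """Hypothetical helper: build an Atlas Vector Search index definition."""
    if similarity not in {"cosine", "euclidean", "dotProduct"}:
        raise ValueError(f"unsupported similarity: {similarity}")
    fields: list[dict[str, Any]] = [
        {
            "type": "vector",
            "path": embedding_key,        # must match the tool's embedding_key
            "numDimensions": dimensions,  # must match the embedding length
            "similarity": similarity,
        }
    ]
    # Fields used in pre-filters must be indexed as filter fields.
    for path in filter_paths or []:
        fields.append({"type": "filter", "path": path})
    return {"fields": fields}
```

If numDimensions does not match the length of the stored embedding vectors, searches against the index will not return the expected documents, so the dimensions value should be kept in sync with the embedding model's output size.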

Outputs

Name	Type	Description
return	str	JSON-encoded list of matching documents with scores, or empty string on error
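
Because the tool returns a JSON string and signals errors with an empty string, callers typically need to decode the result defensively. A minimal sketch follows; parse_search_results is a hypothetical helper, and the document shape in the usage note is invented for illustration.

```python
from __future__ import annotations

import json
from typing import Any


def parse_search_results(raw: str) -> list[dict[str, Any]]:
    """Hypothetical helper: decode the tool's output into a list of documents."""
    if not raw:
        # The tool returns "" on error; treat that as "no results".
        return []
    docs = json.loads(raw)
    # Normalize a single document into a one-element list.
    return docs if isinstance(docs, list) else [docs]
```

For example, parse_search_results('[{"text": "doc", "score": 0.92}]') yields a one-element list of dictionaries, while parse_search_results("") yields an empty list.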

Usage Examples

Basic Usage

import os
os.environ["OPENAI_API_KEY"] = "your-openai-key"

from crewai_tools.tools.mongodb_vector_search_tool.vector_search import (
    MongoDBVectorSearchTool,
    MongoDBVectorSearchConfig,
)

tool = MongoDBVectorSearchTool(
    database_name="my_database",
    collection_name="documents",
    connection_string="mongodb+srv://user:pass@cluster.mongodb.net/",
    query_config=MongoDBVectorSearchConfig(
        limit=5,
        oversampling_factor=10,
        include_embeddings=False,
    ),
)

# Search for relevant documents; the result is a JSON-encoded string ("" on error)
results = tool._run(query="machine learning best practices")

# Add documents to the collection
tool.add_texts(
    texts=["Document one text", "Document two text"],
    metadatas=[{"source": "web"}, {"source": "pdf"}],
)
