Heuristic:CrewAIInc CrewAI RAG Search Defaults
| Knowledge Sources | |
|---|---|
| Domains | RAG, Knowledge_Management |
| Last Updated | 2026-02-11 17:00 GMT |
Overview
Default search parameters for CrewAI's Knowledge and Memory systems: a result limit of 5 (`results_limit=5` in Knowledge, `limit=5` in Memory) and `score_threshold=0.6`, applied consistently across all retrieval operations.
Description
CrewAI's Knowledge and Memory subsystems use identical default search parameters: a maximum of 5 results and a 0.6 (60%) similarity score threshold. These defaults are applied across `Knowledge.query()`, `Memory.search()`, and their async variants. The consistency ensures predictable behavior regardless of which retrieval path is used (short-term memory, long-term memory, entity memory, or knowledge sources).
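The combined effect of the two defaults can be illustrated with a minimal sketch. This is not CrewAI's implementation: `apply_search_defaults` and the `(text, score)` tuples are hypothetical, and the sketch assumes scores are similarities in the range 0 to 1 (higher is better) and that the threshold is inclusive.

```python
from typing import List, Tuple

# Hypothetical shape for a retrieved chunk: (text, similarity score).
ScoredChunk = Tuple[str, float]


def apply_search_defaults(
    candidates: List[ScoredChunk],
    results_limit: int = 5,
    score_threshold: float = 0.6,
) -> List[ScoredChunk]:
    """Drop chunks below the threshold, then keep the best `results_limit`."""
    # Assumption: a score exactly equal to the threshold is kept.
    kept = [c for c in candidates if c[1] >= score_threshold]
    # Highest similarity first.
    kept.sort(key=lambda c: c[1], reverse=True)
    return kept[:results_limit]


# Illustrative scores only; not real retrieval output.
candidates = [
    ("a", 0.91), ("b", 0.42), ("c", 0.77), ("d", 0.60),
    ("e", 0.85), ("f", 0.59), ("g", 0.66), ("h", 0.72),
]
top = apply_search_defaults(candidates)
print([text for text, _ in top])
```

In this sample, six chunks clear the threshold but only five are returned, so the limit and the threshold each act as an independent cap on what reaches the agent.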
Usage
Apply this heuristic when tuning RAG retrieval quality in CrewAI agents. If agents are missing relevant context, consider lowering `score_threshold` below 0.6. If agents are getting too much irrelevant context, increase `score_threshold` or reduce `results_limit`. The defaults work well for typical use cases but may need adjustment for domain-specific knowledge bases.
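One way to decide whether to move `score_threshold` is to sweep it over a small hand-labeled sample and compare precision and recall. The sketch below uses made-up scores and relevance labels, and `precision_recall` is a hypothetical helper rather than part of CrewAI; it also ignores the result limit to keep the threshold effect isolated.

```python
from typing import Dict, List, Tuple

# Hypothetical labeled sample: (similarity score, is the chunk actually relevant?)
labeled: List[Tuple[float, bool]] = [
    (0.92, True), (0.81, True), (0.74, False), (0.66, True),
    (0.58, True), (0.55, False), (0.47, False), (0.41, True),
]


def precision_recall(threshold: float) -> Dict[str, float]:
    """Precision and recall if only chunks scoring >= threshold are retrieved."""
    retrieved = [relevant for score, relevant in labeled if score >= threshold]
    total_relevant = sum(relevant for _, relevant in labeled)
    hits = sum(retrieved)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / total_relevant if total_relevant else 0.0
    return {"precision": precision, "recall": recall}


report = {t: precision_recall(t) for t in (0.4, 0.6, 0.8)}
for t, metrics in report.items():
    print(
        f"threshold={t:.1f} "
        f"precision={metrics['precision']:.2f} recall={metrics['recall']:.2f}"
    )
```

On this sample, raising the threshold from 0.4 to 0.8 trades recall for precision, which is the pattern to look for when judging whether 0.6 suits a given knowledge base.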
The Insight (Rule of Thumb)
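- Action: Start from the built-in defaults of 5 results and `score_threshold=0.6` for all RAG queries. The result cap is named `results_limit` in `Knowledge.query()` and `limit` in `Memory.search()`.
- Value: A limit of 5 returns at most the 5 highest-scoring results; `score_threshold=0.6` filters out matches scoring below 0.6 (60% similarity)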
- Trade-off: Higher `score_threshold` increases precision but may miss relevant results; lower threshold increases recall but adds noise. More results consume more context window tokens.
- Consistency: These defaults are identical across `Knowledge.query()`, `Memory.search()`, and all memory subtypes (short-term, long-term, entity)
Reasoning
Returning 5 results balances relevance against context window consumption. In typical LLM interactions, 5 relevant chunks usually provide enough context without crowding out the rest of the prompt. The 0.6 threshold is a practical floor that filters out coincidental word matches while retaining semantically related content. These values appear intended to work across diverse use cases (document Q&A, agent memory recall, entity lookups) without requiring per-use-case tuning.
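The context window cost of the result limit can be estimated with simple arithmetic. The figures below are assumptions chosen for illustration (400 tokens per chunk, an 8192-token context window); neither comes from CrewAI, and both helper functions are hypothetical.

```python
def retrieval_token_budget(results_limit: int = 5, tokens_per_chunk: int = 400) -> int:
    """Upper bound on tokens consumed if every result slot is filled."""
    return results_limit * tokens_per_chunk


def context_share(
    results_limit: int = 5,
    tokens_per_chunk: int = 400,
    context_window: int = 8192,
) -> float:
    """Fraction of the context window taken by retrieved chunks."""
    return retrieval_token_budget(results_limit, tokens_per_chunk) / context_window


budget = retrieval_token_budget()
share = context_share()
print(f"{budget} tokens, about {share:.0%} of the window")

# Doubling the limit doubles the worst-case cost.
doubled = retrieval_token_budget(results_limit=10)
```

Under these assumptions, the default limit uses roughly a quarter of the window in the worst case, and doubling the limit doubles that cost, so chunk size matters as much as the limit itself.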
The consistency across all retrieval systems is deliberate: users can predict behavior regardless of whether context comes from Knowledge sources or Memory storage, reducing cognitive load when debugging retrieval quality.
Code Evidence
Knowledge query defaults from `lib/crewai/src/crewai/knowledge/knowledge.py:46-47`:
def query(
    self, query: list[str], results_limit: int = 5, score_threshold: float = 0.6
) -> list[SearchResult]:
Memory search defaults from `lib/crewai/src/crewai/memory/memory.py:76-80`:
def search(
    self,
    query: str,
    limit: int = 5,
    score_threshold: float = 0.6,
) -> list[Any]:
Async variant with the same defaults, from `lib/crewai/src/crewai/knowledge/knowledge.py:79-80`:
async def aquery(
    self, query: list[str], results_limit: int = 5, score_threshold: float = 0.6
) -> list[SearchResult]: