Heuristic:CrewAIInc CrewAI RAG Search Defaults

From Leeroopedia
Knowledge Sources
Domains: RAG, Knowledge_Management
Last Updated: 2026-02-11 17:00 GMT

Overview

Default search parameters for CrewAI's Knowledge and Memory systems: a result limit of 5 (`results_limit=5` in `Knowledge.query()`, `limit=5` in `Memory.search()`) and `score_threshold=0.6`, applied consistently across all retrieval operations.

Description

CrewAI's Knowledge and Memory subsystems use the same default search values: a maximum of 5 results and a similarity score threshold of 0.6 (60%). Only the name of the limit parameter differs: `results_limit` in `Knowledge.query()` and `limit` in `Memory.search()`. The defaults apply to both methods and to their async variants, so retrieval behaves predictably regardless of which path is used (short-term memory, long-term memory, entity memory, or knowledge sources).

Usage

Apply this heuristic when tuning RAG retrieval quality in CrewAI agents. If agents are missing relevant context, consider lowering `score_threshold` below 0.6. If agents are receiving too much irrelevant context, raise `score_threshold` or reduce the result limit (`results_limit` for Knowledge, `limit` for Memory). The defaults work well for typical use cases but may need adjustment for domain-specific knowledge bases.
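A minimal sketch of how the two parameters interact, using a hypothetical `filter_results` helper and made-up similarity scores rather than CrewAI's actual storage backends. It assumes scores lie in [0, 1] with higher meaning more similar, and that a chunk scoring exactly at the threshold is kept; neither assumption is confirmed by the signatures cited on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredChunk:
    text: str
    score: float  # assumed similarity in [0, 1]; higher means closer to the query


def filter_results(chunks, limit=5, score_threshold=0.6):
    """Hypothetical stand-in for a retrieval call.

    Keeps chunks scoring at or above the threshold, orders them best first,
    and caps the list at `limit` entries.
    """
    kept = [c for c in chunks if c.score >= score_threshold]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:limit]


# Made-up candidates for illustration only.
candidates = [
    ScoredChunk("refund policy overview", 0.82),
    ScoredChunk("refund processing times", 0.71),
    ScoredChunk("regional refund exceptions", 0.55),
    ScoredChunk("shipping rates", 0.41),
]

default_hits = filter_results(candidates)                      # limit 5, threshold 0.6
recall_hits = filter_results(candidates, score_threshold=0.5)  # looser threshold
precise_hits = filter_results(candidates, limit=1, score_threshold=0.8)

for name, hits in [("default", default_hits), ("recall", recall_hits), ("precise", precise_hits)]:
    print(name, [c.text for c in hits])
```

With these toy scores, lowering the threshold to 0.5 admits the "regional refund exceptions" chunk that the default setting drops, while the stricter setting keeps only the single strongest match.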

The Insight (Rule of Thumb)

  • Action: Use a result limit of 5 and `score_threshold=0.6` as defaults for all RAG queries
  • Value: The limit (`results_limit=5` for Knowledge, `limit=5` for Memory) returns at most the 5 most relevant results; `score_threshold=0.6` filters out matches below 60% similarity
  • Trade-off: A higher `score_threshold` increases precision but may miss relevant results; a lower threshold increases recall but adds noise. More results consume more context window tokens.
  • Consistency: These default values are identical across `Knowledge.query()`, `Memory.search()`, and all memory subtypes (short-term, long-term, entity)
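Because the limit parameter is named `results_limit` in `Knowledge.query()` and `limit` in `Memory.search()`, a project that tunes both paths may want a single place to hold its overrides. The sketch below is one hypothetical way to do that: `RetrievalDefaults` and its two methods are illustrative names, not part of CrewAI, and only the keyword names are taken from the signatures cited on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalDefaults:
    """Hypothetical holder for project-wide retrieval settings."""

    limit: int = 5                # same default value as both signatures
    score_threshold: float = 0.6  # same default value as both signatures

    def knowledge_kwargs(self):
        # Knowledge.query() names the limit `results_limit`.
        return {"results_limit": self.limit, "score_threshold": self.score_threshold}

    def memory_kwargs(self):
        # Memory.search() names the limit `limit`.
        return {"limit": self.limit, "score_threshold": self.score_threshold}


defaults = RetrievalDefaults()
strict = RetrievalDefaults(limit=3, score_threshold=0.75)

print(defaults.knowledge_kwargs())
print(strict.memory_kwargs())
```

Assuming already-constructed `knowledge` and `memory` objects, the settings could then be passed as `knowledge.query(["..."], **strict.knowledge_kwargs())` and `memory.search("...", **strict.memory_kwargs())`, which keeps the two retrieval paths tuned together.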

Reasoning

Returning 5 results balances relevance against context window consumption: in typical LLM interactions, 5 relevant chunks give the model enough context without crowding the prompt. The 0.6 threshold is a practical floor that filters out weak, coincidental matches while retaining semantically related content. These values are meant to work across diverse use cases (document Q&A, agent memory recall, entity lookups) without requiring per-use-case tuning.

The consistency across all retrieval systems is deliberate: users can predict behavior regardless of whether context comes from Knowledge sources or Memory storage, reducing cognitive load when debugging retrieval quality.
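The context-window side of this reasoning can be made concrete with a rough estimate. The sketch below is illustrative arithmetic only: the average chunk size (400 tokens) and the context window (8,000 tokens) are assumed numbers, not values taken from CrewAI, and both function names are hypothetical.

```python
def retrieval_token_budget(limit, avg_chunk_tokens):
    """Upper bound on tokens spent on retrieved chunks, assuming every slot is filled."""
    return limit * avg_chunk_tokens


def fraction_of_context(limit, avg_chunk_tokens, context_window_tokens):
    """Share of the context window taken by retrieved chunks under the same assumption."""
    return retrieval_token_budget(limit, avg_chunk_tokens) / context_window_tokens


# Assumed figures for illustration: 400-token chunks, 8,000-token window.
budget = retrieval_token_budget(limit=5, avg_chunk_tokens=400)
share = fraction_of_context(limit=5, avg_chunk_tokens=400, context_window_tokens=8000)
doubled = fraction_of_context(limit=10, avg_chunk_tokens=400, context_window_tokens=8000)

print(budget, share, doubled)
```

Under these assumptions, the default limit spends at most a quarter of the window on retrieved context and doubling the limit spends half of it, which is the cost to weigh before raising the limit.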

Code Evidence

Knowledge query defaults from `lib/crewai/src/crewai/knowledge/knowledge.py:46-47`:

def query(
    self, query: list[str], results_limit: int = 5, score_threshold: float = 0.6
) -> list[SearchResult]:

Memory search defaults from `lib/crewai/src/crewai/memory/memory.py:76-80`:

def search(
    self,
    query: str,
    limit: int = 5,
    score_threshold: float = 0.6,
) -> list[Any]:

Async variant with the same defaults from `lib/crewai/src/crewai/knowledge/knowledge.py:79-80`:

async def aquery(
    self, query: list[str], results_limit: int = 5, score_threshold: float = 0.6
) -> list[SearchResult]:
