Implementation:CrewAIInc CrewAI RAG Directory Loader

Knowledge Sources	CrewAI
Domains	RAG, Data_Loading
Last Updated	2026-02-11 00:00 GMT

Overview

Recursively loads and processes all files from a local directory, automatically selecting the appropriate loader for each file type and aggregating content into a single result.

Description

DirectoryLoader extends BaseLoader to traverse directory structures and process files in bulk. It supports several filtering options: recursive traversal (default True), include_extensions and exclude_extensions lists for file type filtering, and max_files to cap the number of processed files. Hidden files and directories (starting with ".") are automatically skipped.

For each file found, the loader uses DataTypes.from_content() to detect the file type and then delegates to the appropriate specialized loader (e.g., PDFLoader for .pdf files, JSONLoader for .json files). Each file's content is prefixed with a header ("=== File: path ===") and errors during individual file processing are captured without halting the overall operation.

The returned LoaderResult contains the combined content from all files and comprehensive metadata including total_files found, processed_files count, errors count, individual file_details (path, metadata, source), and error_details for any failures.

URL directory loading is explicitly not supported; only local filesystem paths are accepted.

Usage

Import DirectoryLoader when you need to ingest an entire directory of documents into the RAG knowledge base. It is typically instantiated automatically by the DataType.DIRECTORY registry.

Code Reference

Source Location

Repository: CrewAI
File: lib/crewai-tools/src/crewai_tools/rag/loaders/directory_loader.py
Lines: 1-166

Signature

class DirectoryLoader(BaseLoader):
    def load(self, source_content: SourceContent, **kwargs) -> LoaderResult: ...

Import

from crewai_tools.rag.loaders.directory_loader import DirectoryLoader

I/O Contract

Inputs

Name	Type	Required	Description
source_content	SourceContent	Yes	Wraps a local directory path
recursive	bool	No	Whether to search recursively (default True, passed via kwargs)
include_extensions	list[str]	No	Only include files with these extensions (passed via kwargs)
exclude_extensions	list[str]	No	Exclude files with these extensions (passed via kwargs)
max_files	int	No	Maximum number of files to process (passed via kwargs)

Outputs

Name	Type	Description
return	LoaderResult	Combined content from all files with metadata including total_files, processed_files, errors, file_details, and error_details

Usage Examples

Basic Usage

from crewai_tools.rag.loaders.directory_loader import DirectoryLoader
from crewai_tools.rag.source_content import SourceContent

loader = DirectoryLoader()

# Load all files from a directory recursively
source = SourceContent("/path/to/documents/")
result = loader.load(source)
print(f"Processed {result.metadata['processed_files']} of {result.metadata['total_files']} files")

# Load with filtering
result = loader.load(
    source,
    include_extensions=[".pdf", ".txt", ".md"],
    max_files=50,
    recursive=True,
)

Related Pages

Principle:CrewAIInc_CrewAI_Knowledge_Ingestion

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment