Implementation:CrewAIInc CrewAI RAG MDX Loader
| Knowledge Sources | |
|---|---|
| Domains | RAG, Data_Loading, Text_Processing |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
Loads MDX (Markdown with JSX) files and strips code-related syntax such as import/export statements and JSX tags to extract pure markdown content.
Description
MDXLoader extends BaseLoader to handle MDX files commonly used in modern documentation systems like Docusaurus and Next.js. It reads content from local files (with UTF-8 encoding) or fetches from URLs using the shared load_from_url utility with markdown-compatible accept headers.
The _parse_mdx() method applies four precompiled regex patterns in sequence:
- _IMPORT_PATTERN: Removes all import statements (lines starting with "import").
- _EXPORT_PATTERN: Removes all export statements (lines starting with "export").
- _JSX_TAG_PATTERN: Removes JSX/HTML tags (anything matching <...>).
- _EXTRA_NEWLINES_PATTERN: Normalizes excessive newlines (three or more consecutive) down to double newlines.
The result is clean markdown content with all JSX/React component syntax removed, making it suitable for text processing and embedding.
Usage
Import MDXLoader when you need to explicitly load MDX files. It is typically instantiated automatically by the DataType.MDX registry when .mdx or .md files are detected.
Code Reference
Source Location
- Repository: CrewAI
- File: lib/crewai-tools/src/crewai_tools/rag/loaders/mdx_loader.py
- Lines: 1-61
Signature
class MDXLoader(BaseLoader):
def load(self, source_content: SourceContent, **kwargs) -> LoaderResult: ...
Import
from crewai_tools.rag.loaders.mdx_loader import MDXLoader
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| source_content | SourceContent | Yes | Wraps an MDX/MD file path, URL, or raw MDX string |
| **kwargs | Any | No | Additional keyword arguments passed to URL loading |
Outputs
| Name | Type | Description |
|---|---|---|
| return | LoaderResult | Contains cleaned markdown text with JSX syntax removed, source reference, and metadata (format: "mdx") |
Usage Examples
Basic Usage
from crewai_tools.rag.loaders.mdx_loader import MDXLoader
from crewai_tools.rag.source_content import SourceContent
loader = MDXLoader()
# Load from a local MDX file
source = SourceContent("/path/to/docs/guide.mdx")
result = loader.load(source)
# Input might be:
# import { Component } from 'react'
# export const meta = { title: 'Guide' }
# # Getting Started
# <Component prop="value">
# Some content here.
# </Component>
# Output will be:
# # Getting Started
# Some content here.
print(result.metadata)
# {'format': 'mdx'}