

Implementation:CrewAIInc CrewAI RAG MDX Loader

From Leeroopedia
Knowledge Sources
Domains: RAG, Data_Loading, Text_Processing
Last Updated: 2026-02-11 00:00 GMT

Overview

Loads MDX (Markdown with JSX) files and strips code-related syntax such as import/export statements and JSX tags to extract pure markdown content.

Description

MDXLoader extends BaseLoader to handle MDX files commonly used in modern documentation systems like Docusaurus and Next.js. It reads content from local files (with UTF-8 encoding) or fetches from URLs using the shared load_from_url utility with markdown-compatible accept headers.
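The source handling described above can be sketched with the standard library alone. This is an illustrative approximation, not the loader's code: read_mdx_source and MARKDOWN_ACCEPT are hypothetical names, the Accept header value is an assumption (the headers actually sent by load_from_url are not listed here), and the fallback of treating any other input as raw MDX text is inferred from the I/O contract rather than confirmed.

```python
import os
import urllib.request
from urllib.parse import urlparse

# Assumed header value; the real load_from_url utility may send something else.
MARKDOWN_ACCEPT = "text/markdown, text/x-markdown, text/plain"


def read_mdx_source(source: str, timeout: float = 30.0) -> str:
    """Return MDX text from a URL, a local file path, or a raw string."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        # Remote source: request markdown-friendly content types.
        request = urllib.request.Request(source, headers={"Accept": MARKDOWN_ACCEPT})
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8")
    if os.path.isfile(source):
        # Local source: read the file as UTF-8.
        with open(source, encoding="utf-8") as handle:
            return handle.read()
    # Anything else is treated as raw MDX text (an assumption of this sketch).
    return source
```

Checking for a URL first and a file second keeps a raw string from being misread as a path only when no such file exists, so a real loader may want stricter rules than this sketch applies.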

The _parse_mdx() method applies four precompiled regex patterns in sequence:

  • _IMPORT_PATTERN: Removes all import statements (lines starting with "import").
  • _EXPORT_PATTERN: Removes all export statements (lines starting with "export").
  • _JSX_TAG_PATTERN: Removes JSX/HTML tags (anything matching <...>). Only the tags are removed; text between an opening and a closing tag is kept.
  • _EXTRA_NEWLINES_PATTERN: Normalizes excessive newlines (three or more consecutive) down to double newlines.

The result is clean markdown content with all JSX/React component syntax removed, making it suitable for text processing and embedding.
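The four-step cleanup described above can be approximated as follows. This is a minimal sketch under stated assumptions: the regular expressions are stand-ins written for illustration and may differ from the loader's actual patterns, strip_mdx is a hypothetical function name, and the final strip() is an addition of this sketch.

```python
import re

# Illustrative stand-ins; the loader's real expressions may differ.
IMPORT_PATTERN = re.compile(r"^import\b.*\n?", re.MULTILINE)
EXPORT_PATTERN = re.compile(r"^export\b.*\n?", re.MULTILINE)
JSX_TAG_PATTERN = re.compile(r"<[^>]+>")
EXTRA_NEWLINES_PATTERN = re.compile(r"\n(?:[ \t]*\n){2,}")


def strip_mdx(text: str) -> str:
    """Apply the four substitutions in the order listed above."""
    text = IMPORT_PATTERN.sub("", text)            # drop lines starting with "import"
    text = EXPORT_PATTERN.sub("", text)            # drop lines starting with "export"
    text = JSX_TAG_PATTERN.sub("", text)           # drop <...> tags, keep inner text
    text = EXTRA_NEWLINES_PATTERN.sub("\n\n", text)  # 3+ newlines become 2
    return text.strip()


sample = (
    "import { Component } from 'react'\n"
    "export const meta = { title: 'Guide' }\n"
    "# Getting Started\n"
    '<Component prop="value">\n'
    "Some content here.\n"
    "</Component>\n"
)
print(strip_mdx(sample))
# prints:
# # Getting Started
#
# Some content here.
```

Because these are line-oriented and tag-oriented substitutions rather than a parser, a statement that spans several lines is only partly removed by this sketch, and any literal <...> sequence in prose or inline code is stripped as well.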

Usage

Import MDXLoader when you need to explicitly load MDX files. It is typically instantiated automatically by the DataType.MDX registry when .mdx or .md files are detected.

Code Reference

Source Location

  • Repository: CrewAI
  • File: lib/crewai-tools/src/crewai_tools/rag/loaders/mdx_loader.py
  • Lines: 1-61

Signature

class MDXLoader(BaseLoader):
    def load(self, source_content: SourceContent, **kwargs) -> LoaderResult: ...

Import

from crewai_tools.rag.loaders.mdx_loader import MDXLoader

I/O Contract

Inputs

Name            Type           Required  Description
source_content  SourceContent  Yes       Wraps an MDX/MD file path, URL, or raw MDX string
**kwargs        Any            No        Additional keyword arguments passed to URL loading

Outputs

Name    Type          Description
return  LoaderResult  Contains cleaned markdown text with JSX syntax removed, source reference, and metadata (format: "mdx")

Usage Examples

Basic Usage

from crewai_tools.rag.loaders.mdx_loader import MDXLoader
from crewai_tools.rag.source_content import SourceContent

loader = MDXLoader()

# Load from a local MDX file
source = SourceContent("/path/to/docs/guide.mdx")
result = loader.load(source)

# Example contents of guide.mdx:
#   import { Component } from 'react'
#   export const meta = { title: 'Guide' }
#   # Getting Started
#   <Component prop="value">
#   Some content here.
#   </Component>

# The cleaned text in the result would then be roughly
# (a blank line may remain where a tag-only line was removed):
#   # Getting Started
#   Some content here.

print(result.metadata)
# {'format': 'mdx'}
