Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:CrewAIInc CrewAI RAG YouTube Channel Loader

From Leeroopedia
Knowledge Sources
Domains RAG, Data_Loading, Web_Scraping
Last Updated 2026-02-11 00:00 GMT

Overview

Extracts comprehensive content from YouTube channels including channel metadata, video information, and transcript previews for multiple videos.

Description

YoutubeChannelLoader extends BaseLoader to process YouTube channels. It requires the pytube library (lazily imported) and validates URLs against four supported channel URL formats: /channel/, /c/, /@, and /user/.

The loading process proceeds in stages: 1. Channel metadata: Uses pytube's Channel class to fetch channel_name, channel_id, and total video count. 2. Video enumeration: Collects video URLs up to the max_videos limit (default 10, configurable via kwargs). 3. Video details: For each video, uses pytube.YouTube to fetch title and description (truncated to 200 characters). 4. Transcript extraction: Uses youtube-transcript-api to fetch transcript previews (limited to 500 characters per video). The language selection follows a fallback strategy: manual English, then generated English, then any available transcript.

The _extract_video_id() static method uses regex to parse video IDs from various YouTube URL formats. If pytube or youtube-transcript-api is unavailable, the loader degrades gracefully to listing only video URLs.

The returned LoaderResult includes structured text with channel overview and per-video summaries, along with metadata tracking channel name, ID, loaded video count, and total video count.

Usage

Import YoutubeChannelLoader when you need to load YouTube channel content. It is typically instantiated automatically by the DataType.YOUTUBE_CHANNEL registry when YouTube channel URLs are detected.

Code Reference

Source Location

  • Repository: CrewAI
  • File: lib/crewai-tools/src/crewai_tools/rag/loaders/youtube_channel_loader.py
  • Lines: 1-162

Signature

class YoutubeChannelLoader(BaseLoader):
    def load(self, source: SourceContent, **kwargs) -> LoaderResult: ...

Import

from crewai_tools.rag.loaders.youtube_channel_loader import YoutubeChannelLoader

I/O Contract

Inputs

Name Type Required Description
source SourceContent Yes Wraps a YouTube channel URL (/channel/, /c/, /@, or /user/ format)
max_videos int No Maximum number of videos to load (default 10, passed via kwargs)

Outputs

Name Type Description
return LoaderResult Structured text with channel overview and per-video summaries; metadata includes source URL, channel_name, channel_id, num_videos_loaded, and total_videos

Usage Examples

Basic Usage

from crewai_tools.rag.loaders.youtube_channel_loader import YoutubeChannelLoader
from crewai_tools.rag.source_content import SourceContent

loader = YoutubeChannelLoader()

source = SourceContent("https://www.youtube.com/@ExampleChannel")
result = loader.load(source, max_videos=5)

print(result.content)
# YouTube Channel: Example Channel
# Channel ID: UC...
# Total Videos: 150
# Videos Loaded: 5
#
# --- Video Summaries ---
#
# 1. Video Title
#    URL: https://www.youtube.com/watch?v=...
#    Description: First 200 characters...
#    Transcript Preview: First 500 characters of transcript...

print(result.metadata)
# {'source': 'https://...', 'data_type': 'youtube_channel', 'channel_name': 'Example Channel', ...}

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment