Implementation:CrewAIInc CrewAI RAG YouTube Channel Loader
| Knowledge Sources | |
|---|---|
| Domains | RAG, Data_Loading, Web_Scraping |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
Extracts comprehensive content from YouTube channels including channel metadata, video information, and transcript previews for multiple videos.
Description
YoutubeChannelLoader extends BaseLoader to process YouTube channels. It requires the pytube library (lazily imported) and validates URLs against four supported channel URL formats: /channel/, /c/, /@, and /user/.
The loading process proceeds in stages: 1. Channel metadata: Uses pytube's Channel class to fetch channel_name, channel_id, and total video count. 2. Video enumeration: Collects video URLs up to the max_videos limit (default 10, configurable via kwargs). 3. Video details: For each video, uses pytube.YouTube to fetch title and description (truncated to 200 characters). 4. Transcript extraction: Uses youtube-transcript-api to fetch transcript previews (limited to 500 characters per video). The language selection follows a fallback strategy: manual English, then generated English, then any available transcript.
The _extract_video_id() static method uses regex to parse video IDs from various YouTube URL formats. If pytube or youtube-transcript-api is unavailable, the loader degrades gracefully to listing only video URLs.
The returned LoaderResult includes structured text with channel overview and per-video summaries, along with metadata tracking channel name, ID, loaded video count, and total video count.
Usage
Import YoutubeChannelLoader when you need to load YouTube channel content. It is typically instantiated automatically by the DataType.YOUTUBE_CHANNEL registry when YouTube channel URLs are detected.
Code Reference
Source Location
- Repository: CrewAI
- File: lib/crewai-tools/src/crewai_tools/rag/loaders/youtube_channel_loader.py
- Lines: 1-162
Signature
class YoutubeChannelLoader(BaseLoader):
def load(self, source: SourceContent, **kwargs) -> LoaderResult: ...
Import
from crewai_tools.rag.loaders.youtube_channel_loader import YoutubeChannelLoader
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| source | SourceContent | Yes | Wraps a YouTube channel URL (/channel/, /c/, /@, or /user/ format) |
| max_videos | int | No | Maximum number of videos to load (default 10, passed via kwargs) |
Outputs
| Name | Type | Description |
|---|---|---|
| return | LoaderResult | Structured text with channel overview and per-video summaries; metadata includes source URL, channel_name, channel_id, num_videos_loaded, and total_videos |
Usage Examples
Basic Usage
from crewai_tools.rag.loaders.youtube_channel_loader import YoutubeChannelLoader
from crewai_tools.rag.source_content import SourceContent
loader = YoutubeChannelLoader()
source = SourceContent("https://www.youtube.com/@ExampleChannel")
result = loader.load(source, max_videos=5)
print(result.content)
# YouTube Channel: Example Channel
# Channel ID: UC...
# Total Videos: 150
# Videos Loaded: 5
#
# --- Video Summaries ---
#
# 1. Video Title
# URL: https://www.youtube.com/watch?v=...
# Description: First 200 characters...
# Transcript Preview: First 500 characters of transcript...
print(result.metadata)
# {'source': 'https://...', 'data_type': 'youtube_channel', 'channel_name': 'Example Channel', ...}