Workflow: Dagster + Modal Serverless Pipeline
| Knowledge Sources | |
|---|---|
| Domains | ML_Ops, Serverless, Audio_Processing |
| Last Updated | 2026-02-10 12:00 GMT |
Overview
End-to-end process for building a serverless GPU-powered podcast transcription pipeline using Dagster for orchestration and Modal for compute, with RSS-based detection and OpenAI-powered summarization.
Description
This workflow demonstrates how to orchestrate cross-environment compute using Dagster Pipes with the Modal serverless platform. It monitors podcast RSS feeds for new episodes using a Dagster sensor, downloads audio to Cloudflare R2 object storage, invokes GPU-powered transcription on Modal using OpenAI Whisper, and generates summaries using the OpenAI API. The pipeline showcases Dagster's ability to orchestrate external compute environments while maintaining full observability and lineage tracking through the Pipes protocol.
Usage
Execute this workflow when you need to process media files (audio, video, images) that require GPU compute but want to avoid managing GPU infrastructure. This pattern is appropriate for any workload where the orchestration logic runs in a standard environment but the heavy computation must execute on specialized hardware provisioned on demand. Requires Modal and OpenAI API accounts.
Execution Steps
Step 1: RSS Feed Monitoring
Configure a Dagster sensor that monitors podcast RSS feeds for new episodes by tracking the HTTP ETag header. When a new episode is detected (the ETag changes), the sensor triggers a pipeline run to process the new content. This provides event-driven execution with minimal polling overhead: each sensor evaluation is a cheap header check rather than a full feed download.
Key considerations:
- Sensors use ETag-based change detection for efficient polling
- Each new episode triggers an independent pipeline run
- Sensor state persists between evaluations to track previously processed episodes
- Multiple RSS feeds can be monitored by separate sensor instances
Step 2: Audio Download and Staging
Download the podcast audio file from the RSS feed URL and stage it to Cloudflare R2 object storage. R2 serves as a shared filesystem accessible by both the Dagster orchestration environment and the Modal compute environment. This decouples data transfer from compute scheduling.
Key considerations:
- Cloudflare R2 provides S3-compatible object storage accessible from Modal
- Audio files are staged before transcription to avoid re-downloading on retries
- The download asset produces a storage key used by downstream transcription
Step 3: GPU-Powered Transcription
Invoke a Modal application that transcribes the staged audio using OpenAI Whisper on GPU hardware. Dagster Pipes (via ModalClient) manages the cross-environment execution, passing parameters to Modal and receiving transcription results back. The Modal application processes audio in parallel segments for efficiency.
Key considerations:
- Dagster Pipes enables cross-environment orchestration without custom RPC
- The ModalClient handles Modal app invocation and result collection
- Audio is segmented for parallel GPU processing within Modal
- The Modal application defines its own Python dependencies and GPU requirements
- Transcription results flow back to Dagster through the Pipes protocol
Step 4: Summary Generation and Notification
Process the transcription text through the OpenAI API to generate a concise summary of the podcast episode. The summary and associated metadata are recorded as Dagster asset materializations. Notifications can be dispatched to alert subscribers of new transcription availability.
Key considerations:
- Summarization uses OpenAI's chat completions API with the full transcript as context
- MaterializeResult records both the summary text and episode metadata
- The summary asset depends on the transcription asset for automatic lineage
- Notification dispatch can integrate with Slack, email, or other channels