Workflow: Dagster + Modal Serverless Pipeline
| Knowledge Sources | |
|---|---|
| Domains | ML_Ops, Serverless, Audio_Processing |
| Last Updated | 2026-02-10 12:00 GMT |
Overview
End-to-end process for building a serverless GPU-powered podcast transcription pipeline using Dagster for orchestration and Modal for compute, with RSS-based detection and OpenAI-powered summarization.
Description
This workflow demonstrates how to orchestrate cross-environment compute using Dagster Pipes with the Modal serverless platform. It monitors podcast RSS feeds for new episodes using a Dagster sensor, downloads audio to Cloudflare R2 object storage, invokes GPU-powered transcription on Modal using OpenAI Whisper, and generates summaries using the OpenAI API. The pipeline showcases Dagster's ability to orchestrate external compute environments while maintaining full observability and lineage tracking through the Pipes protocol.
Usage
Execute this workflow when you need to process media files (audio, video, images) that require GPU compute but want to avoid managing GPU infrastructure. This pattern is appropriate for any workload where the orchestration logic runs in a standard environment but the heavy computation must execute on specialized hardware provisioned on demand. Requires Modal and OpenAI API accounts.
Execution Steps
Step 1: RSS Feed Monitoring
Configure a Dagster sensor that monitors podcast RSS feeds for new episodes by tracking the HTTP ETag header. When a new episode is detected (the ETag changes), the sensor triggers a pipeline run to process the new content. This provides event-driven execution with minimal polling overhead: each sensor evaluation is a cheap header check rather than a full feed download.
Key considerations:
- Sensors use ETag-based change detection for efficient polling
- Each new episode triggers an independent pipeline run
- Sensor state persists between evaluations to track previously processed episodes
- Multiple RSS feeds can be monitored by separate sensor instances
Step 2: Audio Download and Staging
Download the podcast audio file from the RSS feed URL and stage it to Cloudflare R2 object storage. R2 serves as a shared filesystem accessible by both the Dagster orchestration environment and the Modal compute environment. This decouples data transfer from compute scheduling.
Key considerations:
- Cloudflare R2 provides S3-compatible object storage accessible from Modal
- Audio files are staged before transcription to avoid re-downloading on retries
- The download asset produces a storage key used by downstream transcription
Step 3: GPU-Powered Transcription
Invoke a Modal application that transcribes the staged audio using OpenAI Whisper on GPU hardware. Dagster Pipes (via ModalClient) manages the cross-environment execution, passing parameters to Modal and receiving transcription results back. The Modal application processes audio in parallel segments for efficiency.
Key considerations:
- Dagster Pipes enables cross-environment orchestration without custom RPC
- The ModalClient handles Modal app invocation and result collection
- Audio is segmented for parallel GPU processing within Modal
- The Modal application defines its own Python dependencies and GPU requirements
- Transcription results flow back to Dagster through the Pipes protocol
Step 4: Summary Generation and Notification
Process the transcription text through the OpenAI API to generate a concise summary of the podcast episode. The summary and associated metadata are recorded as Dagster asset materializations. Notifications can be dispatched to alert subscribers of new transcription availability.
Key considerations:
- Summarization uses OpenAI's chat completions API with the full transcript as context
- MaterializeResult records both the summary text and episode metadata
- The summary asset depends on the transcription asset for automatic lineage
- Notification dispatch can integrate with Slack, email, or other channels