Workflow:Risingwavelabs Risingwave Sink Connector Pipeline

Knowledge Sources	RisingWave RisingWave Docs Sink Connectors
Domains	Stream_Processing, Data_Engineering, Connectors
Last Updated	2026-02-09 12:00 GMT

Overview

End-to-end process for transforming streaming data in RisingWave and delivering results to external downstream systems through sink connectors including databases, search engines, message queues, and data warehouses.

Description

This workflow describes the standard pattern for building streaming data delivery pipelines where RisingWave acts as the transformation hub. Data is ingested from streaming sources, transformed via SQL materialized views, and then delivered to one or more downstream systems through RisingWave's extensive sink connector library. Supported sinks include JDBC databases (PostgreSQL, MySQL, SQL Server, Snowflake), Elasticsearch/OpenSearch, Cassandra, Redis, Kafka, NATS, MQTT, ClickHouse, StarRocks, Doris, DynamoDB, BigQuery, and Apache Pinot.

The process covers:

Goal: Transformed streaming data continuously delivered to external systems in near real-time.
Scope: From source ingestion through transformation to delivery at one or more sinks.
Strategy: Uses RisingWave's native Rust sinks and Java connector node sinks (via JNI bridge) with exactly-once or at-least-once delivery guarantees depending on the connector.

Usage

Execute this workflow when you need to deliver transformed streaming data to external systems such as databases, search engines, data warehouses, or message queues. This is the standard pattern for real-time data enrichment and distribution pipelines, operational analytics feeds, and search index maintenance.

Execution Steps

Step 1: Deploy RisingWave and Target System

Start RisingWave alongside the target downstream system. For integration testing, Docker Compose configurations are provided for each supported sink. Ensure network connectivity between RisingWave and the target system.

Key considerations:

JDBC sinks (PostgreSQL, MySQL, Snowflake, SQL Server) require the Java connector node
Elasticsearch, Cassandra, and other Java-based sinks also use the connector node via JNI
Native Rust sinks (Kafka, Redis, S3) connect directly without the Java layer

Step 2: Prepare Target System

Create the necessary schemas, tables, indexes, or topics in the target system that will receive the data from RisingWave. The target schema must be compatible with the data types produced by the materialized view.

Key considerations:

Target table primary keys must match if using upsert mode
Some sinks require specific preparation (Snowflake needs S3 staging, BigQuery needs datasets, Cassandra needs keyspaces)
Elasticsearch requires index mappings compatible with the data schema

Step 3: Create Streaming Source

Define a source connector in RisingWave to ingest data from your message broker or data generator. This provides the raw input data that will be transformed before delivery.

Key considerations:

Match schema and format to incoming data
Multiple sources can feed into the same transformation pipeline
Built-in datagen source is available for testing without external dependencies

Step 4: Define Materialized Views

Create materialized views that transform, aggregate, filter, or enrich the source data. These views produce the output that will be delivered to the sink.

Key considerations:

Transformation logic should produce output compatible with the target system schema
Consider the sink's capabilities when designing transformations (append-only vs. upsert)
Use multiple views for staged transformations

Step 5: Create Sink Connector

Define a sink that connects a materialized view or table to the target system. Specify the connector type, connection parameters, target table/topic, and write mode (append-only or upsert).

Key considerations:

Choose the appropriate connector type matching your target system
Configure authentication credentials and connection parameters
Select append-only for immutable event logs or upsert for maintaining synchronized replicas
The sink factory validates configuration before starting data delivery

Step 6: Verify Data Delivery

Query the target system to verify that data has been delivered correctly. Check row counts, data freshness, and content accuracy against expected results.

Key considerations:

Each integration test includes a verification script (sink_check.py) as a reference
Monitor connector metrics for throughput and error rates
Check RisingWave logs and connector node logs for delivery errors

Execution Diagram

GitHub URL

Workflow Repository