Workflow:Apache Shardingsphere Shadow Database Routing

Attribute	Value
Sources	Repository: apache/shardingsphere, Documentation: Shadow Feature
Domains	Distributed_Database, Shadow_Testing, SQL_Routing
Last Updated	2026-02-10 12:00 GMT

Overview

The Shadow Database Routing workflow is the primary execution pipeline that intercepts incoming SQL statements and routes them to either a production or shadow database based on hint annotations and column-value matching against configured shadow algorithms.

Description

Goal: Transparently redirect SQL traffic to a shadow database for testing, stress verification, or observability purposes without modifying application code or SQL statements.

Scope: This workflow covers the complete routing decision lifecycle from the moment a SQL statement enters the shadow routing decorator through statement type classification, algorithm evaluation, and final data source mapping resolution. It handles all four DML statement types (INSERT, SELECT, UPDATE, DELETE) as well as non-DML statements.

Strategy: The routing follows a two-tier evaluation strategy. First, hint-based shadow algorithms are evaluated, which operate on metadata annotations rather than SQL content. If no hint-based algorithm matches, the system falls back to column-based shadow algorithms that inspect actual column values within the SQL WHERE clause or INSERT values. This hint-first, column-fallback approach allows operators to use broad hint-based routing for general shadow traffic while retaining fine-grained column-value routing for targeted scenarios. For non-DML statements, only hint-based routing is available since there are no column values to evaluate.

Usage

Shadow testing environments: Use this workflow when deploying a parallel shadow database alongside production to validate schema changes, new queries, or data migration correctness without affecting real traffic.
Stress testing and performance profiling: Route synthetic or marked test traffic to a shadow database to measure performance characteristics under realistic conditions.
Data verification pipelines: Direct specific SQL operations to a shadow copy of the database for audit, compliance, or data integrity validation.
Canary deployments: Selectively route traffic matching specific hint flags or column value patterns to validate new database configurations before full rollout.

Execution Steps

Step 1: SQL Statement Interception

Component: ShadowSQLRouter

When a SQL statement arrives at the ShardingSphere routing layer, the ShadowSQLRouter acts as a decorator on the existing route context. It implements the DecorateSQLRouter interface, meaning it does not create a new route from scratch but instead modifies an already-established route context produced by upstream routers. The router receives the query context, the database metadata, the shadow rule configuration, and the current set of route units. It iterates over each existing route unit and identifies which data sources are governed by shadow rules by checking whether the actual data source name in each route unit corresponds to a known production data source within the shadow configuration. For each matching route unit, the router prepares to replace the data source target based on the shadow routing decision that will be computed in subsequent steps.

Step 2: Statement Type Classification

Component: ShadowDataSourceMappingsRetrieverFactory

The factory inspects the SQL statement context to determine the operation type. It checks whether the statement is an instance of InsertStatementContext, DeleteStatementContext, UpdateStatementContext, or SelectStatementContext. If the statement matches any of these four DML types, the factory assigns the corresponding ShadowOperationType (INSERT, DELETE, UPDATE, or SELECT) and creates a ShadowDMLStatementDataSourceMappingsRetriever, which will perform both hint-based and column-based routing evaluation. If the statement does not match any DML type (for example, DDL statements, utility commands, or other non-DML operations), the factory creates a ShadowHintDataSourceMappingsRetriever instead, which relies exclusively on hint-based algorithms since non-DML statements lack column values to evaluate. This classification step is the critical branching point that determines the entire downstream retrieval strategy.

Step 3: Hint Based Routing Attempt

Components: ShadowDMLStatementDataSourceMappingsRetriever, ShadowTableHintDataSourceMappingsRetriever, ShadowHintDataSourceMappingsRetriever

For DML statements, the ShadowDMLStatementDataSourceMappingsRetriever first filters the table names referenced in the SQL against the shadow rule to identify which tables are configured as shadow tables. It then delegates to the ShadowTableHintDataSourceMappingsRetriever, which attempts hint-based routing at the table level. If no shadow tables are found in the SQL, the retriever checks whether a default shadow algorithm is configured and whether it matches the current hint context. If a default algorithm matches, all shadow data source mappings are returned. If shadow tables are present, the retriever iterates through each shadow table and evaluates the table-specific hint shadow algorithms against a shadow condition constructed from the table name and the DML operation type. The first table whose hint algorithms produce a positive match yields its data source mappings as the result.

For non-DML statements, the ShadowHintDataSourceMappingsRetriever operates at a global level. It retrieves all hint shadow algorithms registered in the shadow rule and evaluates each one against a generic shadow condition. If any hint algorithm determines that the statement should be routed to a shadow database, the retriever returns all shadow data source mappings. If no hint algorithm matches, an empty mapping is returned, and the SQL is routed to the production database.

Step 4: Column Based Routing Fallback

Components: ShadowColumnDataSourceMappingsRetriever, ShadowInsertStatementDataSourceMappingsRetriever, ShadowSelectStatementDataSourceMappingsRetriever, ShadowUpdateStatementDataSourceMappingsRetriever, ShadowDeleteStatementDataSourceMappingsRetriever

If the hint-based routing attempt in the previous step produces an empty result for DML statements, the system falls back to column-based routing. The ShadowDMLStatementDataSourceMappingsRetriever invokes the statement-specific column retriever that was created during initialization. Each statement type has its own column extraction logic: the INSERT retriever inspects the insert column names and their corresponding literal values from each insert value context; the SELECT, UPDATE, and DELETE retrievers parse the WHERE clause segments, extract individual expressions, identify columns, and use the ShadowExtractor utility to resolve actual values from either literal expressions or parameter markers.

The abstract ShadowColumnDataSourceMappingsRetriever coordinates the column-based evaluation. For each shadow table, it retrieves the shadow column names configured for the specific operation type, then for each shadow column it obtains the column shadow algorithms and the shadow column conditions (which contain the extracted column values). It evaluates whether any combination of column condition and algorithm produces a shadow match. If a match is found for any shadow table, the corresponding data source mappings are returned.

Step 5: Shadow Algorithm Evaluation

Components: HintShadowAlgorithmDeterminer, ColumnShadowAlgorithmDeterminer, ShadowExtractor

The determiners serve as the core evaluation engine for shadow routing decisions. The HintShadowAlgorithmDeterminer constructs a PreciseHintShadowValue containing the table name, the operation type, and the boolean shadow hint flag. It passes this value along with the complete set of shadow table names to the hint shadow algorithm's isShadow method, which returns whether the statement should be routed to the shadow database.

The ColumnShadowAlgorithmDeterminer constructs PreciseColumnShadowValue objects for each value extracted from the shadow column condition. It evaluates each value against the column shadow algorithm. The determiner applies a strict matching rule: it verifies that the table name in the shadow condition matches the column condition's table, and it requires that the algorithm returns true for every value in the condition. If any value fails the algorithm check or any table name mismatch is detected, the column-based determination returns false.

The ShadowExtractor is a utility that handles value extraction from various SQL expression types. It recursively processes BinaryOperationExpression (extracting the right-hand side), InExpression (extracting the right-hand collection), ListExpression (iterating over list items), and SimpleExpressionSegment (extracting literal values or resolving parameter marker indices against the query parameters). Only values that implement the Comparable interface are accepted as valid shadow column values.

Step 6: Data Source Mapping Resolution

Component: ShadowSQLRouter, ShadowRule

Once the retrieval chain produces a mapping of production data source names to shadow data source names, the ShadowSQLRouter applies the mapping to the route context. For each route unit in the current context, the router looks up the actual data source name in the shadow rule to find the corresponding production data source name. If the production data source name has an entry in the shadow data source mappings (meaning a shadow algorithm determined this SQL should go to the shadow database), the route unit is replaced with a new route unit pointing to the shadow data source. If no mapping entry exists for that production data source, the route unit is replaced with one pointing back to the production data source, effectively keeping the original routing intact. The router removes all original route units that were processed and adds the new route units with the resolved data source names, completing the shadow routing decoration. The table mappers from the original route units are preserved, ensuring that only the data source target changes while table routing remains unaffected.

Execution Diagram

flowchart TD
    A["SQL Statement Arrives"] --> B["ShadowSQLRouter.decorateRouteContext()"]
    B --> C["ShadowDataSourceMappingsRetrieverFactory.newInstance()"]
    C --> D{"Statement Type?"}
    D -->|"INSERT / SELECT / UPDATE / DELETE"| E["ShadowDMLStatementDataSourceMappingsRetriever"]
    D -->|"Non-DML (DDL, etc.)"| F["ShadowHintDataSourceMappingsRetriever"]

    E --> G["Filter shadow tables from ShadowRule"]
    G --> H["ShadowTableHintDataSourceMappingsRetriever"]
    H --> I{"Hint algorithms match?"}
    I -->|"Yes"| J["Return shadow data source mappings"]
    I -->|"No"| K["Column-based fallback"]

    K --> L{"Statement subtype?"}
    L -->|"INSERT"| M["ShadowInsertStatementDataSourceMappingsRetriever"]
    L -->|"SELECT"| N["ShadowSelectStatementDataSourceMappingsRetriever"]
    L -->|"UPDATE"| O["ShadowUpdateStatementDataSourceMappingsRetriever"]
    L -->|"DELETE"| P["ShadowDeleteStatementDataSourceMappingsRetriever"]

    M --> Q["Extract column values from insert data"]
    N --> R["Extract column values from WHERE clause"]
    O --> R
    P --> R

    Q --> S["ShadowColumnDataSourceMappingsRetriever.retrieve()"]
    R --> S

    S --> T["ColumnShadowAlgorithmDeterminer.isShadow()"]
    T --> U{"Column algorithm matches?"}
    U -->|"Yes"| J
    U -->|"No"| V["Return empty mappings (use production)"]

    F --> W["Evaluate all hint shadow algorithms"]
    W --> X["HintShadowAlgorithmDeterminer.isShadow()"]
    X --> Y{"Any hint algorithm matches?"}
    Y -->|"Yes"| Z["Return all shadow data source mappings"]
    Y -->|"No"| V

    J --> AA["ShadowSQLRouter resolves route units"]
    Z --> AA
    V --> AA
    AA --> AB["Replace data source in RouteContext"]
    AB --> AC["SQL routed to shadow or production database"]

GitHub URL

Workflow Repository