Principle:Apache Shardingsphere YAML Deserialization

Knowledge Sources	Apache ShardingSphere, ShardingSphere Shadow Docs
Domains	Configuration_Management, Shadow_Testing
Last Updated	2026-02-10 00:00 GMT

Overview

Deserializing YAML text into typed Java configuration objects using a safe YAML parsing engine.

Description

YAML Deserialization is the process of converting raw YAML text (from files, byte arrays, or strings) into strongly-typed Java objects. In a rule configuration workflow, this is the first transformation step: it takes human-readable YAML and produces an in-memory object graph that can be programmatically processed.

A safe YAML deserialization engine must address several concerns:

Type-Safe Construction: The engine must instantiate only known, whitelisted classes. If YAML tags could name arbitrary classes, untrusted input could invoke the constructors and setters of any class on the classpath, which is a well-known route to code execution. A custom YAML constructor restricts which types can be materialized during parsing.
Null Handling: When the YAML content is empty or represents a null document, the engine must return a valid default instance of the target class rather than null. This is achieved by falling back to reflective construction (classType.getConstructor().newInstance()) when the parsed result is null.
Input Source Flexibility: The engine should accept multiple input forms -- files, byte arrays, and strings -- to support diverse loading scenarios such as file-based startup configuration, network-transferred configuration bytes, and in-memory configuration strings from governance centers.
Loader Options: The engine configures YAML loader options that bound document size and complexity (for example, limits on input length or on alias expansion), mitigating denial-of-service attacks from maliciously crafted YAML input.
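The type-safe construction concern can be illustrated with a small, library-agnostic sketch. The class TypeAllowlistSketch and its resolve method are hypothetical names introduced here for illustration; they are not part of ShardingSphere or of any YAML library. The sketch only shows the allowlist check a safe constructor would perform before materializing a type named by a YAML tag.

```java
import java.util.Set;

/** Hypothetical helper: resolves a class named by a YAML tag only if it is explicitly allowed. */
final class TypeAllowlistSketch {

    private final Set<String> allowedClassNames;

    TypeAllowlistSketch(final Set<String> allowedClassNames) {
        // Defensive copy so the allowlist cannot be widened after construction.
        this.allowedClassNames = Set.copyOf(allowedClassNames);
    }

    /** Returns the class for the given name, or throws if the name is not on the allowlist. */
    Class<?> resolve(final String className) {
        if (!allowedClassNames.contains(className)) {
            // Reject before Class.forName so an unlisted class is never loaded.
            throw new IllegalArgumentException("Class is not accepted: " + className);
        }
        try {
            return Class.forName(className);
        } catch (final ClassNotFoundException ex) {
            throw new IllegalStateException("Allowed class is not on the classpath: " + className, ex);
        }
    }
}
```

Rejecting the name before loading the class matters: the check happens ahead of any class loading or instantiation, so an unlisted type never gets a chance to run code.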

The deserialization step produces YAML-layer configuration objects (such as YamlShadowRuleConfiguration) that still need to be converted into domain configuration objects by a separate swapper component.
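The split between YAML-layer objects and domain objects can be sketched as follows. All three classes below (YamlExampleRuleConfiguration, ExampleRuleConfiguration, ExampleRuleConfigurationSwapper) and the tableNames property are hypothetical stand-ins chosen for illustration; they do not reproduce the actual fields of YamlShadowRuleConfiguration or the real swapper API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical YAML-layer object: a mutable bean with a no-arg constructor and setters for the parser to fill. */
final class YamlExampleRuleConfiguration {

    private List<String> tableNames = new ArrayList<>();

    public List<String> getTableNames() {
        return tableNames;
    }

    public void setTableNames(final List<String> tableNames) {
        this.tableNames = tableNames;
    }
}

/** Hypothetical domain object: immutable once built, so later pipeline stages cannot alter it. */
final class ExampleRuleConfiguration {

    private final List<String> tableNames;

    ExampleRuleConfiguration(final List<String> tableNames) {
        // Copy, then wrap, so changes to the YAML-layer list do not leak into the domain object.
        this.tableNames = Collections.unmodifiableList(new ArrayList<>(tableNames));
    }

    List<String> getTableNames() {
        return tableNames;
    }
}

/** Hypothetical swapper: the separate step that turns the YAML-layer object into the domain object. */
final class ExampleRuleConfigurationSwapper {

    static ExampleRuleConfiguration swapToObject(final YamlExampleRuleConfiguration yamlConfig) {
        return new ExampleRuleConfiguration(yamlConfig.getTableNames());
    }
}
```

Keeping the YAML-layer class a plain bean lets the parser populate it by reflection, while the domain class stays free of parser-driven constraints such as public setters.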

Usage

Use YAML deserialization whenever configuration needs to be loaded from an external source into the application. This occurs during application startup when reading YAML configuration files, during dynamic configuration reload from a governance center, or when processing configuration snippets received over the network. The deserialization step always precedes the swapper conversion step in the configuration loading pipeline.

Theoretical Basis

The safe YAML deserialization pattern follows this logic:

FUNCTION unmarshal(yamlInput, targetClass):
    constructor = createSafeYamlConstructor(targetClass)
    representer = createDefaultRepresenter()
    dumperOptions = createDefaultDumperOptions()
    loaderOptions = createRestrictedLoaderOptions()
    yamlParser = new Yaml(constructor, representer, dumperOptions, loaderOptions)

    result = yamlParser.loadAs(yamlInput, targetClass)

    IF result IS NULL:
        // empty or null document: fall back to the no-arg constructor
        RETURN targetClass.getConstructor().newInstance()
    ELSE:
        RETURN result
    END IF
END FUNCTION
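The pseudocode above could be sketched in Java roughly as follows. This is a minimal sketch under stated assumptions, not the YamlEngine implementation: the class name UnmarshalSketch is hypothetical, and the parser argument is a stand-in for the configured YAML engine so that the example depends only on the JDK. It shows the null fallback and how the byte[] and File overloads funnel into the String overload.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.function.BiFunction;

/** Hypothetical sketch of the unmarshal flow; "parser" stands in for the safely configured YAML engine. */
final class UnmarshalSketch {

    /** Core overload: parse, then replace a null result with a default-constructed instance. */
    static <T> T unmarshal(final String yamlContent, final Class<T> classType,
                           final BiFunction<String, Class<T>, T> parser) {
        T result = parser.apply(yamlContent, classType);
        if (null != result) {
            return result;
        }
        try {
            // Empty or null document: requires a public no-arg constructor on the target class.
            return classType.getConstructor().newInstance();
        } catch (final ReflectiveOperationException ex) {
            throw new IllegalStateException("Cannot create default instance of " + classType.getName(), ex);
        }
    }

    /** Byte input: decode as UTF-8 (an assumption of this sketch), then reuse the String overload. */
    static <T> T unmarshal(final byte[] yamlBytes, final Class<T> classType,
                           final BiFunction<String, Class<T>, T> parser) {
        return unmarshal(new String(yamlBytes, StandardCharsets.UTF_8), classType, parser);
    }

    /** File input: read all bytes, then reuse the byte[] overload. */
    static <T> T unmarshal(final File yamlFile, final Class<T> classType,
                           final BiFunction<String, Class<T>, T> parser) {
        try {
            return unmarshal(Files.readAllBytes(yamlFile.toPath()), classType, parser);
        } catch (final IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

Routing every overload through one core method keeps the null fallback and any safety checks in a single place, so no input form can bypass them.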

The critical design decisions are:

Custom Constructor: ShardingSphereYamlConstructor restricts instantiation to safe types, preventing arbitrary code execution.
Null Fallback: A null parse result (from empty YAML) is converted to a default-constructed instance, ensuring downstream code never receives null.
Overloaded Inputs: Separate method signatures for File, byte[], and String inputs accommodate different configuration sources while maintaining the same safety guarantees.
Optional Property Skipping: A variant accepts a skipMissingProps flag that configures the representer's property utilities, which the parser shares, to ignore YAML keys that do not correspond to any property of the target Java class, enabling forward-compatible configuration parsing.
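The effect of the skipMissingProps flag can be illustrated with a small JDK-only sketch. PropertyBinderSketch and ExampleBindTarget are hypothetical names, and the map stands in for keys already parsed from a YAML document. In SnakeYAML itself the corresponding switch is PropertyUtils.setSkipMissingProperties, so this code only mimics the behavior rather than reproducing the library's mechanism.

```java
import java.lang.reflect.Field;
import java.util.Map;

/** Hypothetical target with a single known property. */
final class ExampleBindTarget {
    String name;
}

/** Hypothetical binder: copies parsed key/value pairs onto the fields of a target object. */
final class PropertyBinderSketch {

    static <T> T bind(final Map<String, Object> parsedKeys, final T target, final boolean skipMissingProps) {
        for (Map.Entry<String, Object> entry : parsedKeys.entrySet()) {
            Field field;
            try {
                field = target.getClass().getDeclaredField(entry.getKey());
            } catch (final NoSuchFieldException ex) {
                if (skipMissingProps) {
                    // Lenient mode: a key with no matching field is silently ignored.
                    continue;
                }
                // Strict mode: an unknown key is treated as a configuration error.
                throw new IllegalArgumentException("Unknown property: " + entry.getKey(), ex);
            }
            try {
                field.setAccessible(true);
                field.set(target, entry.getValue());
            } catch (final IllegalAccessException ex) {
                throw new IllegalStateException("Cannot set property: " + entry.getKey(), ex);
            }
        }
        return target;
    }
}
```

Lenient mode lets an older application version read configuration written for a newer one, at the cost of no longer catching misspelled keys.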

Related Pages

Implemented By

Implementation:Apache_Shardingsphere_YamlEngine_Unmarshal
