Principle:Apache Shardingsphere YAML Deserialization
| Knowledge Sources | |
|---|---|
| Domains | Configuration_Management, Shadow_Testing |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Deserializing YAML text into typed Java configuration objects using a safe YAML parsing engine.
Description
YAML Deserialization is the process of converting raw YAML text (from files, byte arrays, or strings) into strongly-typed Java objects. In a rule configuration workflow, this is the first transformation step: it takes human-readable YAML and produces an in-memory object graph that can be programmatically processed.
A safe YAML deserialization engine must address several concerns:
- Type-Safe Construction: The engine must instantiate only known, whitelisted classes. Arbitrary class instantiation from YAML tags would create security vulnerabilities. A custom YAML constructor restricts which types can be materialized during parsing.
- Null Handling: When the YAML content is empty or represents a null document, the engine must return a valid default instance of the target class rather than null. This is achieved by falling back to reflective construction (classType.getConstructor().newInstance()) when the parsed result is null.
- Input Source Flexibility: The engine should accept multiple input forms (files, byte arrays, and strings) to support diverse loading scenarios such as file-based startup configuration, network-transferred configuration bytes, and in-memory configuration strings from governance centers.
- Loader Options: The engine configures YAML loader options to limit document size and complexity, preventing denial-of-service attacks from maliciously crafted YAML input.
The deserialization step produces YAML-layer configuration objects (such as YamlShadowRuleConfiguration) that still need to be converted into domain configuration objects by a separate swapper component.
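The type-safe construction concern can be illustrated with a minimal allow-list guard. This is a sketch under stated assumptions, not ShardingSphere's actual constructor: the class name AllowListedTypeResolver and its resolve method are hypothetical, and a real YAML constructor would apply a check of this kind before materializing a tagged type.

```java
import java.util.Set;

/** Hypothetical allow-list guard; illustrative only, not the ShardingSphere implementation. */
final class AllowListedTypeResolver {

    private final Set<String> allowedClassNames;

    AllowListedTypeResolver(final Set<String> allowedClassNames) {
        // Defensive, immutable copy so callers cannot widen the allow-list later.
        this.allowedClassNames = Set.copyOf(allowedClassNames);
    }

    /** Resolves a class named in YAML input only if it is on the allow-list. */
    Class<?> resolve(final String className) {
        // Check the allow-list before touching the class loader at all.
        if (!allowedClassNames.contains(className)) {
            throw new IllegalArgumentException("Class is not accepted: " + className);
        }
        try {
            // initialize=false: loading the class must not run its static initializers.
            return Class.forName(className, false, AllowListedTypeResolver.class.getClassLoader());
        } catch (final ClassNotFoundException ex) {
            throw new IllegalArgumentException("Class not found: " + className, ex);
        }
    }
}
```

Checking the name before calling the class loader means a rejected tag never causes any class to be loaded, which is the property the allow-list is meant to guarantee.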
Usage
Use YAML deserialization whenever configuration needs to be loaded from an external source into the application. This occurs during application startup when reading YAML configuration files, during dynamic configuration reload from a governance center, or when processing configuration snippets received over the network. The deserialization step always precedes the swapper conversion step in the configuration loading pipeline.
Theoretical Basis
The safe YAML deserialization pattern follows this logic:
    FUNCTION unmarshal(yamlInput, targetClass):
        constructor = createSafeYamlConstructor(targetClass)
        loaderOptions = createRestrictedLoaderOptions()
        yamlParser = new Yaml(constructor, representer, dumperOptions, loaderOptions)
        result = yamlParser.loadAs(yamlInput, targetClass)
        IF result IS NULL:
            RETURN targetClass.newInstance() // default empty object
        ELSE:
            RETURN result
        END IF
    END FUNCTION
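The null fallback in the pseudocode above can be sketched in plain Java. To keep the example dependency-free, the YAML library call is replaced by a hypothetical parser function passed in by the caller; the class name NullSafeUnmarshaller is likewise an assumption made for illustration.

```java
import java.util.function.Function;

/** Hypothetical helper showing only the null-fallback step of unmarshalling. */
final class NullSafeUnmarshaller {

    private NullSafeUnmarshaller() {
    }

    /**
     * Parses the content and substitutes a default instance when the parser yields null.
     * The parser argument stands in for the real YAML library call (an assumption).
     */
    static <T> T unmarshal(final String yamlContent, final Class<T> classType, final Function<String, T> parser) {
        T result = parser.apply(yamlContent);
        if (null != result) {
            return result;
        }
        try {
            // Empty or null document: build a default object via the public no-arg constructor.
            return classType.getConstructor().newInstance();
        } catch (final ReflectiveOperationException ex) {
            throw new IllegalStateException("No usable public no-arg constructor: " + classType.getName(), ex);
        }
    }
}
```

The fallback only works for target classes that expose a public no-argument constructor, so YAML-layer configuration classes need to provide one.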
The critical design decisions are:
- Custom Constructor: ShardingSphereYamlConstructor restricts instantiation to safe types, preventing arbitrary code execution.
- Null Fallback: A null parse result (from empty YAML) is converted to a default-constructed instance, ensuring downstream code never receives null.
- Overloaded Inputs: Separate method signatures for File, byte[], and String inputs accommodate different configuration sources while maintaining the same safety guarantees.
- Optional Property Skipping: A variant accepts a skipMissingProps flag that instructs the representer to ignore YAML keys that do not correspond to any Java field, enabling forward-compatible configuration parsing.
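One way to keep the same guarantees across the overloaded inputs is to funnel every overload into a single guarded method. The sketch below is illustrative only: GuardedYamlInputs, read, and MAX_CODE_POINTS are hypothetical names, UTF-8 encoding is assumed for files and byte arrays, and the size check merely stands in for whatever limits the real loader options enforce.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

/** Hypothetical input normalizer: all overloads converge on one size-checked method. */
final class GuardedYamlInputs {

    /** Illustrative limit, not a value taken from ShardingSphere or any YAML library. */
    static final int MAX_CODE_POINTS = 1_000_000;

    private GuardedYamlInputs() {
    }

    /** File input: read the bytes, then reuse the byte[] path. */
    static String read(final File yamlFile) {
        try {
            return read(Files.readAllBytes(yamlFile.toPath()));
        } catch (final IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Byte input: decode as UTF-8 (an assumption), then reuse the String path. */
    static String read(final byte[] yamlBytes) {
        return read(new String(yamlBytes, StandardCharsets.UTF_8));
    }

    /** String input: the single place where the size limit is enforced. */
    static String read(final String yamlContent) {
        if (yamlContent.codePointCount(0, yamlContent.length()) > MAX_CODE_POINTS) {
            throw new IllegalArgumentException("YAML document exceeds " + MAX_CODE_POINTS + " code points");
        }
        return yamlContent;
    }
}
```

Because the File and byte[] overloads only convert and delegate, a limit added to the String overload applies to every input source without duplication.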