Implementation:ArroyoSystems Arroyo Json Schema Converter
Overview
JSON Schema Converter converts JSON Schema definitions into Arrow schemas. It uses the typify crate to parse JSON Schema into a type space, then maps the structural types to Arrow data types, supporting nested objects, arrays, optional fields, and various primitive types.
Description
The module provides:
to_arrow: The main entry point. It takes a schema name and a JSON Schema string and produces an Arrow Schema. Internally it creates a TypeSpace from the parsed JSON Schema, locates the root struct type, and recursively converts it.
get_type_space: Internal helper that parses the JSON Schema string into a RootSchema, assigns the root name "ArroyoJsonRoot", and creates a TypeSpace with custom derive attributes (bincode::Encode, bincode::Decode, PartialEq, PartialOrd).
to_arrow_datatype: Recursive type conversion that maps typify types to Arrow types:
- Structs -> DataType::Struct with fields from properties
- Strings -> DataType::Utf8
- Integers -> DataType::Int64
- Floats -> DataType::Float64
- Booleans -> DataType::Boolean
- Arrays -> DataType::List
- Optional/nullable types -> marked nullable in the Arrow field
- Date-time format strings -> DataType::Timestamp(Nanosecond, None)
- Unrecognized/complex types -> DataType::Utf8 with JSON extension metadata
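The mapping above can be illustrated with a minimal, dependency-free sketch. It is not the module's actual code: JsonType, ArrowType, Field, to_arrow_type, to_field, and example_user are hypothetical stand-ins for the typify type space and arrow_schema types, and the list item field name "item" is an assumption. The sketch only shows the recursive shape of the conversion and how optionality is carried as a nullable flag on the field rather than as a separate type.

```rust
// Hypothetical stand-ins: the real converter walks typify's type space and
// emits arrow_schema::DataType values; these enums only mirror the shape.
#[derive(Debug, Clone, PartialEq)]
enum JsonType {
    String,
    DateTime,
    Integer,
    Float,
    Boolean,
    Array(Box<JsonType>),
    Optional(Box<JsonType>),
    Struct(Vec<(String, JsonType)>),
    Other, // anything not modelled directly
}

#[derive(Debug, Clone, PartialEq)]
enum ArrowType {
    Utf8,
    Json, // stands for Utf8 with JSON extension metadata
    Int64,
    Float64,
    Boolean,
    TimestampNanos, // stands for Timestamp(Nanosecond, None)
    List(Box<Field>),
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: ArrowType,
    nullable: bool,
}

/// Returns the mapped type plus whether the value may be null.
fn to_arrow_type(t: &JsonType) -> (ArrowType, bool) {
    match t {
        JsonType::String => (ArrowType::Utf8, false),
        JsonType::DateTime => (ArrowType::TimestampNanos, false),
        JsonType::Integer => (ArrowType::Int64, false),
        JsonType::Float => (ArrowType::Float64, false),
        JsonType::Boolean => (ArrowType::Boolean, false),
        // "item" as the element field name is an assumption of this sketch.
        JsonType::Array(item) => (ArrowType::List(Box::new(to_field("item", item))), false),
        // Optionality does not change the type, only the nullable flag.
        JsonType::Optional(inner) => (to_arrow_type(inner).0, true),
        JsonType::Struct(props) => {
            let fields = props.iter().map(|(name, ty)| to_field(name, ty)).collect();
            (ArrowType::Struct(fields), false)
        }
        JsonType::Other => (ArrowType::Json, false),
    }
}

fn to_field(name: &str, t: &JsonType) -> Field {
    let (data_type, nullable) = to_arrow_type(t);
    Field { name: name.to_string(), data_type, nullable }
}

/// Builds a field for a struct with a required name and age and an optional email.
fn example_user() -> Field {
    let user = JsonType::Struct(vec![
        ("name".to_string(), JsonType::String),
        ("age".to_string(), JsonType::Integer),
        ("email".to_string(), JsonType::Optional(Box::new(JsonType::String))),
    ]);
    to_field("User", &user)
}
```

Calling example_user() in this sketch yields a struct field whose children are name (Utf8, non-nullable), age (Int64, non-nullable), and email (Utf8, nullable).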
Usage
This is called during CREATE TABLE processing when a user specifies a JSON Schema for their source or sink format configuration.
Code Reference
Source Location
crates/arroyo-formats/src/json/schema.rs
Signature
pub const ROOT_NAME: &str = "ArroyoJsonRoot";
pub fn to_arrow(name: &str, schema: &str) -> anyhow::Result<arrow_schema::Schema>
Import
use arroyo_formats::json::schema::to_arrow;
I/O Contract
Inputs
| Name | Type | Description |
|---|---|---|
| name | &str | Name for the root schema type (for error messages) |
| schema | &str | JSON Schema string to parse and convert |
Outputs
| Name | Type | Description |
|---|---|---|
| arrow_schema | arrow_schema::Schema | Arrow schema derived from the JSON Schema definition |
Usage Examples
let json_schema = r#"{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "age": { "type": "integer" },
        "email": { "type": "string" }
    },
    "required": ["name", "age"]
}"#;

let arrow_schema = to_arrow("User", json_schema)?;
// Schema with fields: name (Utf8, non-nullable), age (Int64, non-nullable), email (Utf8, nullable)
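The nullability in the example follows from the "required" array: properties listed there become non-nullable fields, and the rest become nullable. A minimal sketch of that rule, using only the standard library, is given here. The nullability helper is hypothetical and is not part of arroyo_formats. The real converter is assumed to derive the same information from optional types in the typify type space rather than from the raw lists.

```rust
use std::collections::HashSet;

/// Hypothetical helper: pairs each property name with whether the
/// resulting Arrow field would be nullable (i.e. not listed as required).
fn nullability(properties: &[&str], required: &[&str]) -> Vec<(String, bool)> {
    let required: HashSet<&str> = required.iter().copied().collect();
    properties
        .iter()
        .map(|p| (p.to_string(), !required.contains(p)))
        .collect()
}
```

For the schema in the example, nullability(&["name", "age", "email"], &["name", "age"]) marks only email as nullable.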
Related Pages
- ArroyoSystems_Arroyo_Json_Schema_Module - JSON schema generation (Arrow to JSON direction)
- ArroyoSystems_Arroyo_Avro_Schema_Converter - Similar conversion for Avro schemas
- ArroyoSystems_Arroyo_Proto_Schema_Converter - Similar conversion for Protobuf schemas