Implementation:Apache Paimon JsonUtil
| Knowledge Sources | |
|---|---|
| Domains | Data Serialization, JSON Processing |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
JsonUtil provides utilities for JSON serialization and deserialization of Python dataclasses with support for custom field naming, nested objects, and optional field inclusion.
Description
The JsonUtil module centers on the JSON class, which converts between Python dataclasses and JSON strings in both directions. It reads metadata stored on dataclass fields (set via the json_field() and optional_json_field() helpers) to map Python attribute names to JSON field names, so code can keep Pythonic attribute names while staying compatible with external JSON schemas.
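The name mapping described above can be illustrated with the standard library alone. This is a minimal sketch, assuming the helper stores the JSON name under a "json_name" metadata key; sketch_json_field and TableRef are hypothetical stand-ins for illustration and are not taken from the Paimon source.

```python
from dataclasses import dataclass, field, fields


def sketch_json_field(json_name, **kwargs):
    # Hypothetical stand-in: keep the JSON name in the field's metadata
    # (the "json_name" key is an assumption made for this sketch).
    return field(metadata={"json_name": json_name}, **kwargs)


@dataclass
class TableRef:
    table_name: str = sketch_json_field("tableName")
    schema_id: int = sketch_json_field("schemaId", default=0)


# Build the attribute name -> JSON name mapping from the field metadata.
name_map = {f.name: f.metadata.get("json_name", f.name) for f in fields(TableRef)}
print(name_map)  # {'table_name': 'tableName', 'schema_id': 'schemaId'}
```

Because the mapping lives on the field rather than in a separate lookup table, the dataclass definition stays the single place where an attribute and its JSON key are declared together.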
The to_json() method serializes a dataclass instance to a JSON string, recursing into nested dataclasses and lists of dataclasses. It honors the json_include metadata: a field marked "non_null" is omitted from the output when its value is None. Before falling back to automatic field-by-field conversion, the method checks whether the object defines a custom to_dict() method and uses it if present.
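The serialization order described above (custom to_dict() first, then field-by-field conversion with "non_null" filtering and recursion) might look roughly like the following. This is a sketch under stated assumptions, not the module's actual code: sketch_field, sketch_to_dict, sketch_value, Tag, and Snapshot are hypothetical names, and the metadata keys are assumed.

```python
import json
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any, Dict, List, Optional


def sketch_field(json_name, json_include=None, **kwargs):
    # Hypothetical helper: records the JSON name and an optional inclusion rule.
    metadata = {"json_name": json_name}
    if json_include is not None:
        metadata["json_include"] = json_include
    return field(metadata=metadata, **kwargs)


def sketch_to_dict(obj: Any) -> Dict[str, Any]:
    # A custom to_dict() takes precedence over automatic conversion.
    if callable(getattr(obj, "to_dict", None)):
        return obj.to_dict()
    result = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        # Fields marked "non_null" are dropped when their value is None.
        if value is None and f.metadata.get("json_include") == "non_null":
            continue
        result[f.metadata.get("json_name", f.name)] = sketch_value(value)
    return result


def sketch_value(value: Any) -> Any:
    # Recurse into dataclass instances and lists; pass other values through.
    if is_dataclass(value) and not isinstance(value, type):
        return sketch_to_dict(value)
    if isinstance(value, list):
        return [sketch_value(v) for v in value]
    return value


@dataclass
class Tag:
    tag_name: str = sketch_field("tagName")
    comment: Optional[str] = sketch_field("comment", "non_null", default=None)


@dataclass
class Snapshot:
    snapshot_id: int = sketch_field("snapshotId")
    tags: List[Tag] = sketch_field("tags", default_factory=list)


snapshot = Snapshot(snapshot_id=7, tags=[Tag("v1"), Tag("v2", "stable")])
serialized = json.dumps(sketch_to_dict(snapshot))
print(serialized)
# {"snapshotId": 7, "tags": [{"tagName": "v1"}, {"tagName": "v2", "comment": "stable"}]}
```

Note that the first tag has no "comment" key at all, rather than a null value, which is the effect the "non_null" rule is meant to produce.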
The from_json() method parses a JSON string and reconstructs a dataclass instance from the field type annotations. It handles Optional types, nested dataclasses, and List types whose elements are dataclasses. A dataclass can define a custom from_dict() method for specialized deserialization; when none is present, the automatic field-by-field behavior applies.
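To make the annotation-driven reconstruction concrete, here is a minimal sketch that uses typing.get_origin and typing.get_args to unwrap Optional and List annotations. It is an illustration only: sketch_field, sketch_convert, sketch_from_dict, Column, and Schema are hypothetical names, and the real module may inspect types differently.

```python
import json
from dataclasses import dataclass, field, fields, is_dataclass
from typing import (Any, Dict, List, Optional, Type, TypeVar, Union,
                    get_args, get_origin, get_type_hints)

T = TypeVar("T")


def sketch_field(json_name, **kwargs):
    # Hypothetical helper: stores the JSON name in field metadata.
    return field(metadata={"json_name": json_name}, **kwargs)


def sketch_convert(value: Any, annotation: Any) -> Any:
    if value is None:
        return None
    origin = get_origin(annotation)
    if origin is Union:
        # Optional[X] is Union[X, None]: unwrap to X and convert again.
        inner = [a for a in get_args(annotation) if a is not type(None)]
        return sketch_convert(value, inner[0]) if len(inner) == 1 else value
    if origin is list:
        args = get_args(annotation)
        return [sketch_convert(v, args[0]) for v in value] if args else value
    if is_dataclass(annotation):
        return sketch_from_dict(value, annotation)
    return value


def sketch_from_dict(data: Dict[str, Any], target_class: Type[T]) -> T:
    # A custom from_dict() classmethod takes precedence.
    if callable(getattr(target_class, "from_dict", None)):
        return target_class.from_dict(data)
    hints = get_type_hints(target_class)
    kwargs = {}
    for f in fields(target_class):
        json_name = f.metadata.get("json_name", f.name)
        if json_name in data:  # missing keys fall back to field defaults
            kwargs[f.name] = sketch_convert(data[json_name], hints[f.name])
    return target_class(**kwargs)


@dataclass
class Column:
    column_name: str = sketch_field("columnName")
    comment: Optional[str] = sketch_field("comment", default=None)


@dataclass
class Schema:
    schema_id: int = sketch_field("schemaId")
    columns: List[Column] = sketch_field("columns", default_factory=list)
    primary: Optional[Column] = sketch_field("primary", default=None)


payload = ('{"schemaId": 3, "columns": [{"columnName": "id"}, '
           '{"columnName": "dt", "comment": "partition"}], '
           '"primary": {"columnName": "id"}}')
schema = sketch_from_dict(json.loads(payload), Schema)
print(schema.columns[1].comment)   # partition
print(schema.primary.column_name)  # id
```

Keys absent from the JSON input are simply not passed to the constructor in this sketch, so dataclass defaults apply; a field with no default would raise a TypeError at construction time.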
Usage
Use JsonUtil when serializing Paimon dataclasses to JSON for REST APIs, deserializing API responses into typed Python objects, or implementing custom serialization with field name mapping and conditional inclusion.
Code Reference
Source Location
- Repository: Apache_Paimon
- File: paimon-python/pypaimon/common/json_util.py
Signature
from typing import Any, Dict, Type, TypeVar

T = TypeVar("T")


def json_field(json_name: str, **kwargs):
    """Create a field with custom JSON name"""
    pass


def optional_json_field(json_name: str, json_include: str):
    """Create a field with custom JSON name"""
    pass


class JSON:

    @staticmethod
    def to_json(obj: Any, **kwargs) -> str:
        """Convert to JSON string"""
        pass

    @staticmethod
    def from_json(json_str: str, target_class: Type[T]) -> T:
        """Create instance from JSON string"""
        pass

    @staticmethod
    def __to_dict(obj: Any) -> Dict[str, Any]:
        """Convert to dictionary with custom field names"""
        pass

    @staticmethod
    def __from_dict(data: Dict[str, Any], target_class: Type[T]) -> T:
        """Create instance from dictionary"""
        pass
Import
from pypaimon.common.json_util import JSON, json_field, optional_json_field
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| obj | Any | Yes | Dataclass instance to serialize |
| json_str | str | Yes | JSON string to deserialize |
| target_class | Type[T] | Yes | Target dataclass type for deserialization |
| json_name | str | Yes | Custom JSON field name |
| json_include | str | Yes (optional_json_field() only) | Inclusion rule; "non_null" omits the field when its value is None |
Outputs
| Name | Type | Description |
|---|---|---|
| json_str | str | Serialized JSON string |
| instance | T | Deserialized dataclass instance |
Usage Examples
from dataclasses import dataclass
from typing import Optional, List

from pypaimon.common.json_util import JSON, json_field, optional_json_field


# Define dataclass with custom JSON field names
@dataclass
class User:
    user_id: str = json_field("userId")
    user_name: str = json_field("userName")
    email: Optional[str] = optional_json_field("emailAddress", "non_null")
    age: Optional[int] = json_field("age", default=None)


# Serialize to JSON
user = User(user_id="123", user_name="alice", email="alice@example.com", age=30)
json_str = JSON.to_json(user)
print(json_str)
# {"userId": "123", "userName": "alice", "emailAddress": "alice@example.com", "age": 30}

# Non-null inclusion - omits None values
user_no_email = User(user_id="456", user_name="bob", email=None, age=25)
json_str = JSON.to_json(user_no_email)
# {"userId": "456", "userName": "bob", "age": 25}  # emailAddress omitted

# Deserialize from JSON
json_input = '{"userId": "789", "userName": "charlie", "emailAddress": "charlie@example.com", "age": 35}'
user_obj = JSON.from_json(json_input, User)
print(user_obj.user_name)  # "charlie"
print(user_obj.email)  # "charlie@example.com"

# Nested dataclasses
@dataclass
class Address:
    street: str = json_field("street")
    city: str = json_field("city")


@dataclass
class Person:
    name: str = json_field("name")
    address: Address = json_field("address")


person = Person(
    name="Dave",
    address=Address(street="123 Main St", city="Springfield")
)
json_str = JSON.to_json(person)
# {"name": "Dave", "address": {"street": "123 Main St", "city": "Springfield"}}

person_obj = JSON.from_json(json_str, Person)
print(person_obj.address.city)  # "Springfield"

# Lists of dataclasses
@dataclass
class Team:
    team_name: str = json_field("teamName")
    members: List[User] = json_field("members")


team = Team(
    team_name="Engineering",
    members=[
        User(user_id="1", user_name="alice", email=None),
        User(user_id="2", user_name="bob", email="bob@example.com")
    ]
)
json_str = JSON.to_json(team)
team_obj = JSON.from_json(json_str, Team)
print(f"Team has {len(team_obj.members)} members")

# Custom to_dict/from_dict methods
@dataclass
class CustomClass:
    value: int

    def to_dict(self):
        return {"custom_value": self.value * 2}

    @classmethod
    def from_dict(cls, data):
        return cls(value=data["custom_value"] // 2)


custom = CustomClass(value=10)
json_str = JSON.to_json(custom)  # Uses to_dict()
print(json_str)  # {"custom_value": 20}