Implementation:Apache Paimon RenamingSnapshotCommit
| Knowledge Sources | |
|---|---|
| Domains | Snapshot Management, File System Operations |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
RenamingSnapshotCommit is a SnapshotCommit implementation that uses atomic file renaming to commit snapshots with filesystem-level consistency.
Description
The RenamingSnapshotCommit class provides a file-based atomic commit mechanism for Apache Paimon snapshots. It relies on the atomicity of the file rename operation. Rename is atomic on local filesystems and HDFS; on object storage it is not, so additional lock protection is needed there.
The commit writes the snapshot metadata to its file with try_to_write_atomic, which ensures that either the complete file is written or nothing is written. After a successful snapshot commit, it updates the LATEST hint file to point to the new snapshot ID, so that the most recent snapshot can be looked up quickly.
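The all-or-nothing write can be sketched on a local filesystem as follows. This is a minimal illustration, not the pypaimon implementation: `try_to_write_atomic_local` is a hypothetical helper, and it swaps in `os.link` for the publish step because plain `os.rename` silently overwrites an existing target on POSIX, whereas a commit must fail if the snapshot file already exists.

```python
import os
import uuid


def try_to_write_atomic_local(path: str, content: str) -> bool:
    """Hypothetical local stand-in for an atomic, no-overwrite write.

    Returns True if `path` was created with the full content,
    False if `path` already existed (nothing is changed in that case).
    """
    directory = os.path.dirname(path) or "."
    tmp_path = os.path.join(
        directory, f".{os.path.basename(path)}.{uuid.uuid4().hex}.tmp"
    )

    # Write the complete content to a private temp file first.
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(content)
        f.flush()
        os.fsync(f.fileno())

    try:
        # Publish under the final name; fails if the name is already taken.
        os.link(tmp_path, path)
        return True
    except FileExistsError:
        return False
    finally:
        # The temp name is no longer needed in either case.
        os.unlink(tmp_path)
```

Readers never observe a partially written file under the final name, and of two writers racing for the same snapshot ID only one gets True.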
When the snapshot is not committed, commit returns False. The LATEST hint update is best-effort: if it fails, the implementation logs a warning but does not fail the commit. This design prioritizes snapshot durability over hint file consistency.
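A best-effort hint update might look like the sketch below. The function name, the local-filesystem calls, and the log message are assumptions for illustration; only the behavior (warn, never raise) follows the description above.

```python
import logging
import os

logger = logging.getLogger(__name__)


def commit_latest_hint_best_effort(snapshot_dir: str, snapshot_id: int) -> None:
    """Hypothetical sketch: point the LATEST hint at `snapshot_id`.

    Any filesystem error is logged and swallowed, because the snapshot
    itself is already durable and the hint is only a lookup shortcut.
    """
    hint_path = os.path.join(snapshot_dir, "LATEST")
    tmp_path = hint_path + ".tmp"
    try:
        with open(tmp_path, "w", encoding="utf-8") as f:
            f.write(str(snapshot_id))
        # Overwriting the previous hint is intended here.
        os.replace(tmp_path, hint_path)
    except OSError as e:
        logger.warning(
            "Failed to update LATEST hint to snapshot %s: %s", snapshot_id, e
        )
```

Because the hint can be stale or missing, a reader that relies on it should still be prepared to fall back to listing the snapshot files.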
Usage
Use RenamingSnapshotCommit for file-based table storage on local filesystems or HDFS where atomic rename operations are available, or when catalog-based commits are not required or available.
Code Reference
Source Location
- Repository: Apache_Paimon
- File: paimon-python/pypaimon/snapshot/renaming_snapshot_commit.py
Signature
class RenamingSnapshotCommit(SnapshotCommit):
    """A SnapshotCommit using file renaming to commit.

    Note that when the file system is local or HDFS, rename is atomic.
    But if the file system is object storage, we need additional lock protection.
    """

    def __init__(self, snapshot_manager: SnapshotManager):
        """Initialize with snapshot manager."""

    def commit(self, snapshot: Snapshot, branch: str, statistics: List[PartitionStatistics]) -> bool:
        """Commit the snapshot using file renaming."""

    def close(self):
        """Close the lock and release resources."""

    def _commit_latest_hint(self, snapshot_id: int):
        """Update the latest snapshot hint."""
Import
from pypaimon.snapshot.renaming_snapshot_commit import RenamingSnapshotCommit
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| snapshot_manager | SnapshotManager | Yes | Snapshot manager for file operations |
| snapshot | Snapshot | Yes | Snapshot to commit |
| branch | str | Yes | Branch name (currently unused but kept for interface compatibility) |
| statistics | List[PartitionStatistics] | Yes | Partition statistics (currently unused but kept for interface compatibility) |
Outputs
| Name | Type | Description |
|---|---|---|
| success | bool | True if commit was successful, False otherwise |
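Since a failed commit is signalled by the return value, a caller has to check it and decide what to do next. The sketch below shows one possible reaction, retrying with the next snapshot ID. `commit_with_retry` and the `try_commit` callable are hypothetical and not part of pypaimon; a real committer would typically re-read the latest snapshot and rebuild the snapshot before retrying.

```python
from typing import Callable


def commit_with_retry(
    try_commit: Callable[[int], bool],
    first_id: int,
    max_attempts: int = 5,
) -> int:
    """Hypothetical caller-side loop around a bool-returning commit.

    `try_commit(snapshot_id)` stands in for building a snapshot with that
    ID and passing it to a SnapshotCommit. Returns the committed ID.
    """
    for attempt in range(max_attempts):
        snapshot_id = first_id + attempt
        if try_commit(snapshot_id):
            return snapshot_id
        # False: the ID was taken or the write failed, so try the next one.
    raise RuntimeError(
        f"Commit failed after {max_attempts} attempts starting at ID {first_id}"
    )
```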
Usage Examples
from pypaimon.snapshot.renaming_snapshot_commit import RenamingSnapshotCommit
from pypaimon.snapshot.snapshot_manager import SnapshotManager

# Create snapshot manager (`table` is an existing table object, not shown here)
snapshot_manager = SnapshotManager(table)

# Create renaming snapshot commit
commit_handler = RenamingSnapshotCommit(snapshot_manager)

# Create and commit a snapshot
# (`create_new_snapshot` is a placeholder for code that builds a Snapshot)
snapshot = create_new_snapshot()
success = commit_handler.commit(
    snapshot=snapshot,
    branch="main",
    statistics=[]
)
if success:
    print(f"Snapshot {snapshot.id} committed successfully")
else:
    print("Snapshot already exists or commit failed")

# Close handler
commit_handler.close()

# Alternative: use as a context manager instead of calling close() explicitly
# (assumes the SnapshotCommit base class provides __enter__/__exit__)
with RenamingSnapshotCommit(snapshot_manager) as committer:
    committer.commit(snapshot, "main", [])

# Verify snapshot was committed
latest = snapshot_manager.get_latest_snapshot()
print(f"Latest snapshot ID: {latest.id}")