Implementation:Apache Paimon RenamingSnapshotCommit


Knowledge Sources
Domains Snapshot Management, File System Operations
Last Updated 2026-02-08 00:00 GMT

Overview

RenamingSnapshotCommit is a SnapshotCommit implementation that uses atomic file renaming to commit snapshots with filesystem-level consistency.

Description

The RenamingSnapshotCommit class provides a file-based atomic commit mechanism for Apache Paimon snapshots. It relies on file rename being atomic, which holds on local filesystems and HDFS. On object storage, rename is not atomic, so commits need additional lock protection.
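The extra lock protection mentioned above can be pictured as serializing the existence check and the write, so that two writers cannot both believe they created the same snapshot file. The following is a minimal sketch under stated assumptions, not Paimon's implementation: `InMemoryObjectStore` and `commit_under_lock` are hypothetical names, and a `threading.Lock` stands in for whatever external lock a real deployment would use.

```python
import threading
from typing import Dict


class InMemoryObjectStore:
    """Hypothetical stand-in for a store that offers no atomic rename."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def exists(self, key: str) -> bool:
        return key in self._objects

    def put(self, key: str, data: bytes) -> None:
        # In this sketch, put overwrites an existing key,
        # as an unconditional object-store PUT would.
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def commit_under_lock(store: InMemoryObjectStore, lock: threading.Lock,
                      key: str, data: bytes) -> bool:
    """Write `key` only if it is absent, holding `lock` across check and write."""
    with lock:
        if store.exists(key):
            return False  # another writer already committed this snapshot
        store.put(key, data)
        return True


store = InMemoryObjectStore()
lock = threading.Lock()
first = commit_under_lock(store, lock, "snapshot/snapshot-1", b"writer-a")
second = commit_under_lock(store, lock, "snapshot/snapshot-1", b"writer-b")
```

Without the lock, both writers could pass the `exists` check before either one writes, and the second `put` would silently replace the first writer's snapshot.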

The commit writes the snapshot metadata to a file with try_to_write_atomic, which ensures that either the complete file is written or nothing is. After a successful snapshot commit, it updates the LATEST hint file with the new snapshot ID so that the most recent snapshot can be found quickly.
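To illustrate the all-or-nothing write described above, here is a minimal sketch for a local POSIX filesystem. It is not the body of try_to_write_atomic, whose implementation is not shown on this page. The function name `atomic_write_if_absent` is hypothetical, and publishing through a hard link is an assumption chosen because `os.link` refuses to overwrite an existing target.

```python
import os
import tempfile


def atomic_write_if_absent(path: str, content: str) -> bool:
    """Publish `content` at `path` in one step; return False if `path` exists."""
    directory = os.path.dirname(path) or "."
    # Write the full content to a temporary file in the same directory first,
    # so readers never observe a partially written target file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())
        try:
            # os.link raises FileExistsError if the target already exists,
            # unlike os.replace, which would silently overwrite it.
            os.link(tmp_path, path)
        except FileExistsError:
            return False
        return True
    finally:
        # The temporary name is no longer needed in either case.
        os.remove(tmp_path)


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "snapshot-1")
    first_attempt = atomic_write_if_absent(target, '{"id": 1}')
    second_attempt = atomic_write_if_absent(target, '{"id": 1, "dup": true}')
    with open(target, encoding="utf-8") as f:
        stored = f.read()
```

The second attempt returns False and leaves the first writer's content in place, which is the behavior a snapshot commit needs when two writers race for the same snapshot ID.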

The implementation handles commit failures gracefully. The update of the LATEST hint file is best-effort: if it fails, the implementation logs a warning but does not fail the commit. This design prioritizes snapshot durability over hint file consistency.
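The ordering described above (durable snapshot first, hint second, hint failures only logged) can be sketched as follows. This is an illustration, not the class's actual code: `commit_with_hint`, `write_latest_hint`, and `failing_hint_writer` are hypothetical names, the `snapshot-<id>` file name is an assumption, and opening the file in exclusive mode is a simplified stand-in for the atomic write step.

```python
import logging
import os
import tempfile
from typing import Callable

logger = logging.getLogger("snapshot_commit_sketch")


def write_latest_hint(directory: str, snapshot_id: int) -> None:
    """Overwrite the LATEST hint file with the newest snapshot ID."""
    with open(os.path.join(directory, "LATEST"), "w", encoding="utf-8") as f:
        f.write(str(snapshot_id))


def commit_with_hint(directory: str, snapshot_id: int, payload: str,
                     hint_writer: Callable[[str, int], None] = write_latest_hint) -> bool:
    """Commit the snapshot file, then update the hint on a best-effort basis."""
    snapshot_path = os.path.join(directory, f"snapshot-{snapshot_id}")
    try:
        # Mode "x" fails if the file exists (stand-in for the atomic write).
        with open(snapshot_path, "x", encoding="utf-8") as f:
            f.write(payload)
    except FileExistsError:
        return False
    try:
        hint_writer(directory, snapshot_id)
    except Exception as exc:
        # The snapshot is already written, so a failed hint is only logged.
        logger.warning("Failed to update LATEST hint: %s", exc)
    return True


def failing_hint_writer(directory: str, snapshot_id: int) -> None:
    raise OSError("simulated hint failure")


with tempfile.TemporaryDirectory() as workdir:
    ok = commit_with_hint(workdir, 1, "{}")
    with open(os.path.join(workdir, "LATEST"), encoding="utf-8") as f:
        hint_after_first = f.read()
    ok_despite_hint_failure = commit_with_hint(workdir, 2, "{}", failing_hint_writer)
    snapshot_2_exists = os.path.exists(os.path.join(workdir, "snapshot-2"))
    duplicate = commit_with_hint(workdir, 2, "{}")
```

A stale hint costs readers an extra lookup at most, whereas failing the commit after the snapshot file already exists would leave the caller with a wrong picture of the table state.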

Usage

Use RenamingSnapshotCommit for file-based table storage on local filesystems or HDFS, where atomic rename operations are available, or when catalog-based commits are not required or not available.

Code Reference

Source Location

Signature

class RenamingSnapshotCommit(SnapshotCommit):
    """A SnapshotCommit using file renaming to commit.

    Note that when the file system is local or HDFS, rename is atomic.
    But if the file system is object storage, we need additional lock protection.
    """

    def __init__(self, snapshot_manager: SnapshotManager):
        """Initialize with snapshot manager."""

    def commit(self, snapshot: Snapshot, branch: str, statistics: List[PartitionStatistics]) -> bool:
        """Commit the snapshot using file renaming."""

    def close(self):
        """Close the lock and release resources."""

    def _commit_latest_hint(self, snapshot_id: int):
        """Update the latest snapshot hint."""

Import

from pypaimon.snapshot.renaming_snapshot_commit import RenamingSnapshotCommit

I/O Contract

Inputs

Name | Type | Required | Description
snapshot_manager | SnapshotManager | Yes | Snapshot manager for file operations
snapshot | Snapshot | Yes | Snapshot to commit
branch | str | Yes | Branch name (currently unused but kept for interface compatibility)
statistics | List[PartitionStatistics] | Yes | Partition statistics (currently unused but kept for interface compatibility)

Outputs

Name | Type | Description
success | bool | True if commit was successful, False otherwise
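Because the result is a plain boolean, the caller decides what to do when a commit returns False. One common pattern is to retry with the next snapshot ID. This is a sketch under assumptions, not documented behavior of this class: `commit_with_retry` and `fake_commit` are hypothetical, and a real caller would normally re-read the latest snapshot and rebuild its own snapshot before retrying instead of only incrementing the ID.

```python
from typing import Callable, Set


def commit_with_retry(try_commit: Callable[[int], bool], next_id: int,
                      max_attempts: int = 5) -> int:
    """Try successive snapshot IDs until one commits; return the ID that won."""
    for _ in range(max_attempts):
        if try_commit(next_id):
            return next_id
        # False means the ID was taken (or the write failed); try the next one.
        next_id += 1
    raise RuntimeError(f"commit failed after {max_attempts} attempts")


# IDs already committed by other (simulated) writers.
taken: Set[int] = {1, 2}


def fake_commit(snapshot_id: int) -> bool:
    """Simulated commit: succeeds only for an ID nobody has used yet."""
    if snapshot_id in taken:
        return False
    taken.add(snapshot_id)
    return True


committed_id = commit_with_retry(fake_commit, next_id=1)
```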

Usage Examples

from pypaimon.snapshot.renaming_snapshot_commit import RenamingSnapshotCommit
from pypaimon.snapshot.snapshot_manager import SnapshotManager

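# Create snapshot manager (`table` is an existing table object obtained elsewhere)
snapshot_manager = SnapshotManager(table)

# Create renaming snapshot commit
commit_handler = RenamingSnapshotCommit(snapshot_manager)

# Create and commit a snapshot
# (create_new_snapshot() is a placeholder for however the caller builds a Snapshot)
snapshot = create_new_snapshot()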
success = commit_handler.commit(
    snapshot=snapshot,
    branch="main",
    statistics=[]
)

if success:
    print(f"Snapshot {snapshot.id} committed successfully")
else:
    print("Snapshot already exists or commit failed")

# Close handler
commit_handler.close()

# Use with context manager
# (the snapshot above was already committed, so this call is expected to return False)
with RenamingSnapshotCommit(snapshot_manager) as committer:
    committer.commit(snapshot, "main", [])

# Verify snapshot was committed
latest = snapshot_manager.get_latest_snapshot()
print(f"Latest snapshot ID: {latest.id}")
