Implementation:Apache Paimon PartitionStatistics
| Knowledge Sources | |
|---|---|
| Domains | Data Management, Statistics |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
The PartitionStatistics class captures statistical information about a partition, with support for negative values to indicate data removal.
Description
PartitionStatistics is a serializable class that holds the statistics for a single table partition in Apache Paimon. It records the number of records, the total file size in bytes, the file count, the creation time of the most recently created file, and the total number of buckets. The partition specification (spec) maps partition key names to their values.
The statistical fields may hold negative values, which indicate that data has been removed from the partition. An instance can therefore describe either an absolute snapshot of a partition or an incremental change (a delta) to be combined with earlier statistics.
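To make the delta idea concrete, here is a minimal sketch of applying a negative delta to a base snapshot. `StatsDeltaSketch` and its `apply` method are hypothetical stand-ins written for this page, not Paimon classes, and the merge rule (sum the counters, keep the newest creation time, take the delta's bucket count) is an assumption rather than behavior documented here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in mirroring the fields of PartitionStatistics, so the
// sketch compiles without Paimon on the classpath.
final class StatsDeltaSketch {
    final Map<String, String> spec;
    final long recordCount;
    final long fileSizeInBytes;
    final long fileCount;
    final long lastFileCreationTime;
    final int totalBuckets;

    StatsDeltaSketch(Map<String, String> spec, long recordCount, long fileSizeInBytes,
            long fileCount, long lastFileCreationTime, int totalBuckets) {
        this.spec = Collections.unmodifiableMap(new HashMap<>(spec));
        this.recordCount = recordCount;
        this.fileSizeInBytes = fileSizeInBytes;
        this.fileCount = fileCount;
        this.lastFileCreationTime = lastFileCreationTime;
        this.totalBuckets = totalBuckets;
    }

    // Assumed merge rule (not taken from Paimon): counters are summed, so
    // negative values subtract; the newest creation time wins; the delta's
    // bucket count replaces the base's.
    StatsDeltaSketch apply(StatsDeltaSketch delta) {
        if (!spec.equals(delta.spec)) {
            throw new IllegalArgumentException("Delta belongs to a different partition");
        }
        return new StatsDeltaSketch(
                spec,
                recordCount + delta.recordCount,
                fileSizeInBytes + delta.fileSizeInBytes,
                fileCount + delta.fileCount,
                Math.max(lastFileCreationTime, delta.lastFileCreationTime),
                delta.totalBuckets);
    }

    public static void main(String[] args) {
        Map<String, String> spec = Collections.singletonMap("day", "08");
        StatsDeltaSketch base =
                new StatsDeltaSketch(spec, 150_000L, 786_432_000L, 15L, 1_000L, 32);
        StatsDeltaSketch delta =
                new StatsDeltaSketch(spec, -1_000L, -10_485_760L, -2L, 2_000L, 32);
        StatsDeltaSketch net = base.apply(delta);
        System.out.println(net.recordCount + " records in " + net.fileCount + " files");
    }
}
```

Rejecting a delta whose spec differs from the base is a design choice of this sketch: it prevents statistics from two different partitions being summed by accident.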
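The class is annotated for JSON serialization with Jackson, so instances can be exchanged as JSON, for example in REST API payloads. The @JsonIgnoreProperties(ignoreUnknown = true) annotation makes deserialization skip JSON fields the class does not declare. PartitionStatistics is also the base class of the more feature-rich Partition class, to which it contributes the core statistical fields.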
Usage
Use PartitionStatistics when you need to track or report basic statistical information about a partition without additional metadata like completion status or audit information. It's particularly useful for reporting partition metrics, calculating storage usage, and understanding partition data distribution.
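As an illustration of the reporting use case, the following sketch totals storage across partitions and derives an average file size. `PartitionUsageSketch` and its nested `Stats` class are hypothetical stand-ins that only borrow the accessor names `spec()`, `fileSizeInBytes()`, and `fileCount()`; returning 0 for a non-positive file count is an assumption made here to keep delta instances from producing misleading averages.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class PartitionUsageSketch {

    // Hypothetical stand-in exposing the same accessor names as PartitionStatistics.
    static final class Stats {
        private final Map<String, String> spec;
        private final long fileSizeInBytes;
        private final long fileCount;

        Stats(Map<String, String> spec, long fileSizeInBytes, long fileCount) {
            this.spec = spec;
            this.fileSizeInBytes = fileSizeInBytes;
            this.fileCount = fileCount;
        }

        Map<String, String> spec() { return spec; }
        long fileSizeInBytes() { return fileSizeInBytes; }
        long fileCount() { return fileCount; }
    }

    // Sums file sizes over all partitions; negative (delta) entries reduce the total.
    static long totalBytes(List<Stats> partitions) {
        long total = 0L;
        for (Stats s : partitions) {
            total += s.fileSizeInBytes();
        }
        return total;
    }

    // Average file size in bytes (integer division); 0 when the file count is
    // not positive, which is the case for empty partitions and removal deltas.
    static long averageFileSize(Stats s) {
        return s.fileCount() > 0 ? s.fileSizeInBytes() / s.fileCount() : 0L;
    }

    public static void main(String[] args) {
        List<Stats> partitions = Arrays.asList(
                new Stats(Collections.singletonMap("day", "07"), 300L, 3L),
                new Stats(Collections.singletonMap("day", "08"), 700L, 7L));
        System.out.println("Total bytes: " + totalBytes(partitions));
        for (Stats s : partitions) {
            System.out.println(s.spec() + " average file size: " + averageFileSize(s));
        }
    }
}
```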
Code Reference
Source Location
- Repository: Apache_Paimon
- File: paimon-api/src/main/java/org/apache/paimon/partition/PartitionStatistics.java
Signature
@JsonIgnoreProperties(ignoreUnknown = true)
@Public
public class PartitionStatistics implements Serializable {

    private static final long serialVersionUID = 1L;

    @JsonProperty(FIELD_SPEC)
    protected final Map<String, String> spec;

    @JsonProperty(FIELD_RECORD_COUNT)
    protected final long recordCount;

    @JsonProperty(FIELD_FILE_SIZE_IN_BYTES)
    protected final long fileSizeInBytes;

    @JsonProperty(FIELD_FILE_COUNT)
    protected final long fileCount;

    @JsonProperty(FIELD_LAST_FILE_CREATION_TIME)
    protected final long lastFileCreationTime;

    @JsonProperty(FIELD_TOTAL_BUCKETS)
    protected final int totalBuckets;
}
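Because the class implements Serializable and pins serialVersionUID, instances can pass through standard Java serialization. The sketch below shows such a round trip on a hypothetical stand-in (`SerializationRoundTripSketch.Stats`) that copies three of the fields from the signature; it uses only `java.io` and is not Paimon's own serialization path.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

final class SerializationRoundTripSketch {

    // Hypothetical stand-in with a subset of the fields from the signature.
    static final class Stats implements Serializable {
        private static final long serialVersionUID = 1L;

        final Map<String, String> spec;
        final long recordCount;
        final int totalBuckets;

        Stats(Map<String, String> spec, long recordCount, int totalBuckets) {
            // Copy into a HashMap so the stored map is itself serializable.
            this.spec = new HashMap<>(spec);
            this.recordCount = recordCount;
            this.totalBuckets = totalBuckets;
        }
    }

    // Writes the object to an in-memory buffer and reads it back.
    static Stats roundTrip(Stats in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream input =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Stats) input.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> spec = new HashMap<>();
        spec.put("day", "08");
        Stats copy = roundTrip(new Stats(spec, -1_000L, 32));
        System.out.println(copy.spec + " " + copy.recordCount + " " + copy.totalBuckets);
    }
}
```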
Import
import org.apache.paimon.partition.PartitionStatistics;
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| spec | Map<String, String> | Yes | Partition specification mapping partition keys to values |
| recordCount | long | Yes | Number of records (may be negative for removed data) |
| fileSizeInBytes | long | Yes | Total file size in bytes (may be negative for removed data) |
| fileCount | long | Yes | Number of files (may be negative for removed data) |
| lastFileCreationTime | long | Yes | Timestamp of the last file creation |
| totalBuckets | int | Yes | Total number of buckets (0 when the field is absent, as in data written by older versions) |
Outputs
| Name | Type | Description |
|---|---|---|
| spec() | Map<String, String> | Returns the partition specification |
| recordCount() | long | Returns the number of records |
| fileSizeInBytes() | long | Returns the total file size in bytes |
| fileCount() | long | Returns the number of files |
| lastFileCreationTime() | long | Returns the last file creation timestamp |
| totalBuckets() | int | Returns the total number of buckets |
Usage Examples
import java.util.HashMap;
import java.util.Map;

import org.apache.paimon.partition.PartitionStatistics;

// Create partition statistics for a daily partition
Map<String, String> partitionSpec = new HashMap<>();
partitionSpec.put("year", "2024");
partitionSpec.put("month", "02");
partitionSpec.put("day", "08");

PartitionStatistics stats = new PartitionStatistics(
        partitionSpec,
        150000L,                     // recordCount
        1024L * 1024 * 750,          // fileSizeInBytes (750 MB)
        15L,                         // fileCount
        System.currentTimeMillis(),  // lastFileCreationTime
        32                           // totalBuckets
);

// Access statistics
System.out.println("Partition spec: " + stats.spec());
System.out.println("Total records: " + stats.recordCount());
System.out.println("Storage size: "
        + stats.fileSizeInBytes() / (1024.0 * 1024.0) + " MB");
System.out.println("Number of files: " + stats.fileCount());
System.out.println("Bucket count: " + stats.totalBuckets());

// Negative statistics describe data removed from the partition
PartitionStatistics deltaStats = new PartitionStatistics(
        partitionSpec,
        -1000L,                      // 1000 records removed
        -1024L * 1024 * 10,          // 10 MB removed
        -2L,                         // 2 files removed
        System.currentTimeMillis(),  // lastFileCreationTime
        32                           // totalBuckets
);

// Net record count after applying the delta to the base statistics
long netRecords = stats.recordCount() + deltaStats.recordCount();
System.out.println("Net records after removal: " + netRecords);