Implementation:Apache Paimon Timestamp
| Knowledge Sources | |
|---|---|
| Domains | Time Data, Data Types |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Timestamp is an immutable, timezone-free time representation in Apache Paimon's Python package (pypaimon). It stores milliseconds since the Unix epoch plus a nanosecond-of-millisecond component, which gives it nanosecond resolution.
Description
The Timestamp class represents a timezone-free point in time as milliseconds since the Unix epoch (1970-01-01 00:00:00) plus a nanosecond component within that millisecond (0 to 999,999). This two-part representation allows sub-millisecond precision while keeping storage efficient for the common millisecond-precision case, where the nanosecond component is simply 0.
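The arithmetic behind the two-part representation can be sketched in plain Python. This is an illustration only, not the pypaimon implementation: the helper names `split_epoch_nanos` and `join_epoch_nanos` and the constant `NANOS_PER_MILLI` are hypothetical. It shows why floor division keeps the nanosecond component inside 0 to 999,999 even for points before the epoch.

```python
from typing import Tuple

NANOS_PER_MILLI = 1_000_000  # nanoseconds in one millisecond (hypothetical constant)


def split_epoch_nanos(epoch_nanos: int) -> Tuple[int, int]:
    """Split nanoseconds since the epoch into (millisecond, nano_of_millisecond)."""
    # divmod floors toward negative infinity, so the remainder always lands
    # in 0..999_999, even for values before 1970.
    return divmod(epoch_nanos, NANOS_PER_MILLI)


def join_epoch_nanos(millisecond: int, nano_of_millisecond: int) -> int:
    """Recombine the two components into total nanoseconds since the epoch."""
    return millisecond * NANOS_PER_MILLI + nano_of_millisecond


print(split_epoch_nanos(1_500_000_123))  # (1500, 123)
print(split_epoch_nanos(-1))             # (-1, 999999)
print(join_epoch_nanos(-1, 999_999))     # -1
```

One nanosecond before the epoch becomes millisecond -1 with 999,999 nanoseconds, so the second component never goes negative.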
The class enforces immutability and validates that the nanosecond component stays within the valid range. It converts to and from Python's datetime objects through to_local_date_time() and from_local_date_time(); from_local_date_time() requires a timezone-naive datetime, which preserves the timezone-free semantics. Because Python's datetime resolves only to microseconds, a datetime cannot carry the part of the nanosecond component below one microsecond. The conversions also cover negative timestamps, that is, dates before the epoch.
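A minimal sketch of such a datetime round trip, written with only the standard library, might look like the following. It is not the pypaimon code: the function names are hypothetical, and truncating (rather than rounding) sub-microsecond nanoseconds is an assumption made for the example.

```python
from datetime import datetime, timedelta
from typing import Tuple

EPOCH = datetime(1970, 1, 1)  # timezone-naive epoch


def millis_to_naive_datetime(millisecond: int, nano_of_millisecond: int = 0) -> datetime:
    """Build a naive datetime from the two timestamp components."""
    # datetime has microsecond resolution, so nanoseconds below one
    # microsecond are dropped here (an assumption of this sketch).
    micros = nano_of_millisecond // 1000
    return EPOCH + timedelta(milliseconds=millisecond, microseconds=micros)


def naive_datetime_to_millis(dt: datetime) -> Tuple[int, int]:
    """Return (millisecond, nano_of_millisecond) for a naive datetime."""
    if dt.tzinfo is not None:
        raise ValueError("expected a timezone-naive datetime")
    delta = dt - EPOCH
    # Integer arithmetic avoids float rounding on large or negative values.
    total_micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    millisecond, micro_of_milli = divmod(total_micros, 1000)
    return millisecond, micro_of_milli * 1000


print(naive_datetime_to_millis(datetime(1960, 1, 1)))  # (-315619200000, 0)
print(millis_to_naive_datetime(-315619200000))         # 1960-01-01 00:00:00
```

Working in whole microseconds with `divmod` keeps dates before the epoch exact, which float-based helpers such as `datetime.timestamp()` do not guarantee.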
Timestamp supports multiple construction patterns: from_epoch_millis() for millisecond precision, from_micros() for microsecond precision (common in many systems), from_local_date_time() for datetime conversion, and now() for current time. It implements full comparison operators (__eq__, __lt__, __le__, __gt__, __ge__), making timestamps suitable for sorting and range queries. The to_micros() method enables interoperability with systems that use microsecond precision. String formatting produces human-readable ISO-style datetime strings.
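The behaviours described above (immutability, range validation, microsecond conversion, ordering, and hashing) can be imitated with a frozen, ordered dataclass. The class below, `TimestampSketch`, is a hypothetical stand-in for illustration and should not be read as the actual pypaimon class; in particular, dropping sub-microsecond nanoseconds in `to_micros()` is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class TimestampSketch:
    """Hypothetical stand-in: compares as (millisecond, nano_of_millisecond)."""

    millisecond: int
    nano_of_millisecond: int = 0

    def __post_init__(self) -> None:
        # Reject nanosecond components outside a single millisecond.
        if not 0 <= self.nano_of_millisecond <= 999_999:
            raise ValueError(
                f"nano_of_millisecond must be in 0..999999, got {self.nano_of_millisecond}"
            )

    @staticmethod
    def from_micros(micros: int) -> "TimestampSketch":
        # Floor division keeps the remainder non-negative for negative inputs.
        millisecond, micro_of_milli = divmod(micros, 1000)
        return TimestampSketch(millisecond, micro_of_milli * 1000)

    def to_micros(self) -> int:
        # Nanoseconds below one microsecond are dropped (assumption).
        return self.millisecond * 1000 + self.nano_of_millisecond // 1000


a = TimestampSketch.from_micros(1_640_000_000_000_123)
print(a.millisecond, a.nano_of_millisecond)                 # 1640000000000 123000
print(a.to_micros())                                        # 1640000000000123
print(TimestampSketch(1000) < TimestampSketch(1000, 500))   # True
print(len({TimestampSketch(1000), TimestampSketch(1000)}))  # 1
```

With `order=True` the dataclass compares instances field by field, so milliseconds are compared first and nanoseconds break ties; `frozen=True` blocks attribute assignment and makes instances hashable.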
Usage
Use Timestamp when representing temporal data in Paimon tables that requires sub-millisecond precision, when converting between time representations (epoch milliseconds, epoch microseconds, Python datetime), or when comparing and sorting timestamps during data processing.
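As an illustration of time-based comparisons in data processing, the sketch below runs a range query over sorted keys with the standard `bisect` module. The `events` list and the `events_between` helper are hypothetical, and plain `(millisecond, nano_of_millisecond)` tuples stand in for Timestamp objects so the example runs without pypaimon installed.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

# Hypothetical event log, already sorted by (millisecond, nano_of_millisecond).
events = [
    ((1000, 0), "a"),
    ((1000, 500), "b"),
    ((2000, 0), "c"),
    ((3000, 0), "d"),
]
keys = [key for key, _ in events]


def events_between(start: Tuple[int, int], end: Tuple[int, int]) -> List[str]:
    """Return labels of events whose key satisfies start <= key <= end."""
    lo = bisect_left(keys, start)   # first index with key >= start
    hi = bisect_right(keys, end)    # first index with key > end
    return [label for _, label in events[lo:hi]]


print(events_between((1000, 1), (2000, 0)))  # ['b', 'c']
```

Any type with a consistent total ordering works with `bisect` in the same way, which is what the comparison operators on a timestamp type make possible.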
Code Reference
Source Location
- Repository: Apache_Paimon
- File: paimon-python/pypaimon/data/timestamp.py
Signature
from datetime import datetime  # needed for the datetime annotations below


class Timestamp:
    """
    An internal data structure representing data of TimestampType.
    This data structure is immutable and consists of a milliseconds and nanos-of-millisecond since
    1970-01-01 00:00:00. It might be stored in a compact representation (as a long value) if
    values are small enough.
    This class represents timezone-free timestamps
    """
    MILLIS_PER_DAY = 86400000
    MICROS_PER_MILLIS = 1000
    NANOS_PER_MICROS = 1000
    NANOS_PER_HOUR = 3_600_000_000_000
    NANOS_PER_MINUTE = 60_000_000_000
    NANOS_PER_SECOND = 1_000_000_000
    NANOS_PER_MICROSECOND = 1_000

    def __init__(self, millisecond: int, nano_of_millisecond: int = 0):
        pass

    def get_millisecond(self) -> int:
        pass

    def get_nano_of_millisecond(self) -> int:
        pass

    def to_local_date_time(self) -> datetime:
        pass

    def to_millis_timestamp(self) -> 'Timestamp':
        pass

    def to_micros(self) -> int:
        pass

    @staticmethod
    def now() -> 'Timestamp':
        pass

    @staticmethod
    def from_epoch_millis(milliseconds: int, nanos_of_millisecond: int = 0) -> 'Timestamp':
        pass

    @staticmethod
    def from_local_date_time(date_time: datetime) -> 'Timestamp':
        pass

    @staticmethod
    def from_micros(micros: int) -> 'Timestamp':
        pass
Import
from pypaimon.data.timestamp import Timestamp
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| millisecond | int | Yes | Milliseconds since Unix epoch |
| nano_of_millisecond | int | No | Nanoseconds within millisecond (0-999,999) |
| date_time | datetime | Yes | Python datetime (must be timezone-naive) |
| micros | int | Yes | Microseconds since Unix epoch |
Outputs
| Name | Type | Description |
|---|---|---|
| timestamp | Timestamp | Immutable timestamp instance |
| millisecond | int | Milliseconds component |
| nanos | int | Nanoseconds within millisecond |
| datetime | datetime | Converted Python datetime |
| micros | int | Total microseconds since epoch |
Usage Examples
from pypaimon.data.timestamp import Timestamp
from datetime import datetime
# Create from epoch milliseconds
ts = Timestamp.from_epoch_millis(1640000000000)
print(ts.get_millisecond()) # 1640000000000
print(ts.get_nano_of_millisecond()) # 0
# Create with nanosecond precision
ts = Timestamp.from_epoch_millis(1640000000000, 123456)
print(ts.get_nano_of_millisecond()) # 123456
# Create from datetime
dt = datetime(2024, 1, 1, 12, 30, 45, 500000) # 500000 microseconds
ts = Timestamp.from_local_date_time(dt)
print(ts) # Prints the timestamp's human-readable string form
# Convert back to datetime
recovered_dt = ts.to_local_date_time()
print(recovered_dt) # 2024-01-01 12:30:45.500000
# Create from microseconds
micros = 1640000000000000 # Microseconds since epoch
ts = Timestamp.from_micros(micros)
print(ts.to_micros()) # 1640000000000000
# Get current timestamp
now = Timestamp.now()
print(f"Current time: {now}")
# Comparisons
ts1 = Timestamp.from_epoch_millis(1000)
ts2 = Timestamp.from_epoch_millis(2000)
ts3 = Timestamp.from_epoch_millis(1000, 500)
print(ts1 < ts2) # True
print(ts1 == ts1) # True
print(ts1 < ts3) # True (same millis, but ts3 has more nanos)
print(ts2 > ts1) # True
# Sorting
timestamps = [
    Timestamp.from_epoch_millis(3000),
    Timestamp.from_epoch_millis(1000),
    Timestamp.from_epoch_millis(2000)
]
sorted_ts = sorted(timestamps)
print([ts.get_millisecond() for ts in sorted_ts]) # [1000, 2000, 3000]
# Sub-millisecond precision
ts1 = Timestamp.from_epoch_millis(1000, 100) # 1000ms + 100ns
ts2 = Timestamp.from_epoch_millis(1000, 200) # 1000ms + 200ns
print(ts1 < ts2) # True (differs by 100 nanoseconds)
# Convert to millisecond-only precision
ts = Timestamp.from_epoch_millis(1000, 999999)
millis_only = ts.to_millis_timestamp()
print(millis_only.get_nano_of_millisecond()) # 0
# String representation
ts = Timestamp.from_local_date_time(datetime(2024, 6, 15, 14, 30, 0))
print(str(ts)) # "2024-06-15 14:30:00.000000"
# Handle dates before epoch (negative timestamps)
old_date = datetime(1960, 1, 1, 0, 0, 0)
ts = Timestamp.from_local_date_time(old_date)
print(ts.get_millisecond()) # Negative value
recovered = ts.to_local_date_time()
print(recovered) # 1960-01-01 00:00:00
# Validation errors
try:
    Timestamp.from_epoch_millis(1000, 1000000) # Too many nanos (valid range is 0-999,999)
except ValueError as e:
    print(f"Error: {e}")
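from datetime import timezone
try:
    dt_with_tz = datetime(2024, 1, 1, tzinfo=timezone.utc) # UTC chosen as an example; any tzinfo makes the datetime timezone-aware
    Timestamp.from_local_date_time(dt_with_tz) # Has timezone
except ValueError as e:
    print(f"Error: {e}")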
# Hash and equality for use in sets/dicts
ts_set = {
    Timestamp.from_epoch_millis(1000),
    Timestamp.from_epoch_millis(2000),
    Timestamp.from_epoch_millis(1000) # Duplicate
}
print(len(ts_set)) # 2 (duplicate removed)