Implementation:Apache Paimon MemorySize Python
| Knowledge Sources | |
|---|---|
| Domains | Configuration, Memory Management |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
MemorySize provides a type-safe representation of memory sizes with support for parsing human-readable strings and converting between different units (bytes, KB, MB, GB, TB).
Description
The MemorySize class stores a memory size as a byte count and provides methods for viewing and manipulating that value in different units. It uses binary (1024-based) units, following the kibibyte (KiB), mebibyte (MiB), gibibyte (GiB), and tebibyte (TiB) conventions, though its output labels them with the more familiar abbreviations (for example, "2 gb" or "512 kb").
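To make the 1024-based convention concrete, here is a minimal standalone sketch. It does not use pypaimon; the `BINARY_UNIT_BYTES` table and the `to_bytes` helper are hypothetical names introduced only for illustration.

```python
# Hypothetical lookup table, not part of pypaimon: each label maps to a
# power of 1024, matching the binary-unit convention described above.
BINARY_UNIT_BYTES = {
    "kb": 1 << 10,  # kibibyte
    "mb": 1 << 20,  # mebibyte
    "gb": 1 << 30,  # gibibyte
    "tb": 1 << 40,  # tebibyte
}


def to_bytes(value: int, unit: str) -> int:
    """Convert a whole number of `unit` into bytes (illustrative helper)."""
    return value * BINARY_UNIT_BYTES[unit]


print(to_bytes(1, "kb"))    # 1024, not 1000
print(to_bytes(256, "mb"))  # 268435456
print(to_bytes(1, "gb"))    # 1073741824
```

The 256 MB figure agrees with the byte count shown for `MemorySize.of_mebi_bytes(256)` in the usage examples.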
The class parses human-readable memory size strings such as "512mb", "2gb", or "100" (a bare number is interpreted as bytes) through the parse() static method. The MemoryUnit companion class defines the recognized unit abbreviations and their multipliers, and accepts several spellings per unit (e.g., "m", "mb", and "mebibytes" all denote the same unit). Parsing is case-insensitive and tolerates whitespace between the number and the unit, as in "2 GB".
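The parsing rules described above can be sketched roughly as follows. This is a standalone approximation, not the pypaimon implementation: `parse_bytes_sketch` and the alias table are hypothetical, and only the "m"/"mb"/"mebibytes" and "k"/"kb"/"kibibytes" spellings appear in this document; the other spellings are assumed by analogy.

```python
import re

# Hypothetical alias table. Spellings for bytes, gibibytes, and tebibytes
# are assumptions made by analogy with the documented "m" and "k" forms.
_UNIT_ALIASES = {
    ("b", "bytes"): 1,
    ("k", "kb", "kibibytes"): 1 << 10,
    ("m", "mb", "mebibytes"): 1 << 20,
    ("g", "gb", "gibibytes"): 1 << 30,
    ("t", "tb", "tebibytes"): 1 << 40,
}
# Flatten to alias -> multiplier for direct lookup.
_MULTIPLIERS = {
    alias: mult for aliases, mult in _UNIT_ALIASES.items() for alias in aliases
}

# Leading digits, optional whitespace, then an optional alphabetic unit.
_PATTERN = re.compile(r"([0-9]+)\s*([a-z]*)")


def parse_bytes_sketch(text: str) -> int:
    """Return the byte count for strings such as '512mb', '2 GB', or '100'."""
    match = _PATTERN.fullmatch(text.strip().lower())
    if match is None:
        raise ValueError(f"Cannot parse memory size: {text!r}")
    number, unit = match.groups()
    if unit == "":
        return int(number)  # a bare number is taken as bytes
    if unit not in _MULTIPLIERS:
        raise ValueError(f"Unknown memory unit: {unit!r}")
    return int(number) * _MULTIPLIERS[unit]


print(parse_bytes_sketch("512mb"))  # 536870912
print(parse_bytes_sketch("2 GB"))   # 2147483648
print(parse_bytes_sketch("100"))    # 100
```

Lowercasing the input before matching gives case-insensitivity, and the `\s*` between the two groups absorbs optional whitespace. Whether the real parser rejects exactly the same malformed inputs is not confirmed here.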
MemorySize provides conversion methods (get_bytes(), get_kibi_bytes(), get_mebi_bytes(), etc.) implemented with bit shifts, factory methods for creating instances from different units (of_mebi_bytes(), of_kibi_bytes()), comparison operators, and string formatting. The format_to_string() method chooses the largest unit in which the value is a whole number, producing output like "2 gb" instead of "2048 mb".
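The shift-based conversion and the "largest whole unit" formatting rule can be illustrated with a small standalone sketch. The function names are hypothetical, and the handling of zero and the "bytes" fallback label are assumptions rather than confirmed library behavior.

```python
# Largest unit first. Labels follow the lowercase output shown in this
# document; the "bytes" fallback label is an assumption.
_UNIT_SHIFTS = [("tb", 40), ("gb", 30), ("mb", 20), ("kb", 10)]


def to_mebi_bytes(num_bytes: int) -> int:
    """Whole mebibytes in num_bytes; the right shift drops any remainder."""
    return num_bytes >> 20


def format_bytes_sketch(num_bytes: int) -> str:
    """Format with the largest unit that divides num_bytes exactly."""
    for label, shift in _UNIT_SHIFTS:
        mask = (1 << shift) - 1  # low bits must all be zero for an exact multiple
        if num_bytes > 0 and (num_bytes & mask) == 0:
            return f"{num_bytes >> shift} {label}"
    # Assumption: zero and values that are not a multiple of 1024 print as bytes.
    return f"{num_bytes} bytes"


print(to_mebi_bytes(256 << 20))         # 256
print(format_bytes_sketch(2048 << 20))  # 2 gb
print(format_bytes_sketch(1536 << 10))  # 1536 kb
```

Shifting right by 10, 20, 30, or 40 bits is equivalent to integer division by the corresponding power of 1024, which is why 1536 KB stays in kilobytes: it is 1.5 MB, not a whole number of megabytes.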
Usage
Use MemorySize when parsing memory configuration values from strings, representing memory limits in a type-safe manner, or performing unit conversions in configuration and resource management code.
Code Reference
Source Location
- Repository: Apache_Paimon
- File: paimon-python/pypaimon/common/memory_size.py
Signature
class MemorySize:
    """MemorySize is a representation of a number of bytes, viewable in different units."""

    ZERO = None
    MAX_VALUE = None

    def __init__(self, bytes: int):
        pass

    @staticmethod
    def of_mebi_bytes(mebi_bytes: int) -> 'MemorySize':
        pass

    @staticmethod
    def of_kibi_bytes(kibi_bytes: int) -> 'MemorySize':
        pass

    @staticmethod
    def of_bytes(bytes: int) -> 'MemorySize':
        pass

    def get_bytes(self) -> int:
        pass

    def get_kibi_bytes(self) -> int:
        pass

    def get_mebi_bytes(self) -> int:
        pass

    def get_gibi_bytes(self) -> int:
        pass

    def get_tebi_bytes(self) -> int:
        pass

    def format_to_string(self) -> str:
        pass

    @staticmethod
    def parse(text: str) -> 'MemorySize':
        pass

    @staticmethod
    def parse_bytes(text: str) -> int:
        pass


class MemoryUnit:
    """Enum which defines memory unit, mostly used to parse value from configuration file."""

    def __init__(self, units: list, multiplier: int):
        pass

    BYTES = None
    KILO_BYTES = None
    MEGA_BYTES = None
    GIGA_BYTES = None
    TERA_BYTES = None

    @staticmethod
    def has_unit(text: str) -> bool:
        pass
Import
from pypaimon.common.memory_size import MemorySize, MemoryUnit
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| bytes | int | Yes | Memory size in bytes |
| text | str | Yes | Human-readable memory size string |
| mebi_bytes | int | Yes | Memory size in mebibytes (MB) |
| kibi_bytes | int | Yes | Memory size in kibibytes (KB) |
Outputs
| Name | Type | Description |
|---|---|---|
| memory_size | MemorySize | MemorySize instance |
| bytes | int | Memory size in bytes |
| formatted | str | Human-readable formatted string |
Usage Examples
from pypaimon.common.memory_size import MemorySize, MemoryUnit
# Parse from strings
size1 = MemorySize.parse("512mb")
print(size1.get_mebi_bytes()) # 512
size2 = MemorySize.parse("2 GB")
print(size2.get_gibi_bytes()) # 2
size3 = MemorySize.parse("1024") # No unit = bytes
print(size3.get_bytes()) # 1024
# Create from different units
size = MemorySize.of_mebi_bytes(256)
print(f"Bytes: {size.get_bytes()}") # 268435456
print(f"KB: {size.get_kibi_bytes()}") # 262144
print(f"MB: {size.get_mebi_bytes()}") # 256
size = MemorySize.of_kibi_bytes(1024)
print(f"MB: {size.get_mebi_bytes()}") # 1
# Formatting
size = MemorySize.of_mebi_bytes(2048)
print(size.format_to_string()) # "2 gb"
print(str(size)) # "2 gb"
size = MemorySize.of_kibi_bytes(512)
print(size.format_to_string()) # "512 kb"
# Comparisons
size1 = MemorySize.parse("1gb")
size2 = MemorySize.parse("1024mb")
print(size1 == size2) # True
print(size1 >= size2) # True
size3 = MemorySize.parse("500mb")
print(size3 < size1) # True
# Constants
print(MemorySize.ZERO.get_bytes()) # 0
print(MemorySize.MAX_VALUE.get_bytes()) # 9223372036854775807
# Unit checking
print(MemoryUnit.has_unit("512mb")) # True
print(MemoryUnit.has_unit("512")) # False
# Supported unit formats
size = MemorySize.parse("100 k") # Works
size = MemorySize.parse("100 kb") # Works
size = MemorySize.parse("100 kibibytes") # Works
print(size.get_kibi_bytes()) # 100
# Error handling
try:
    MemorySize.parse("invalid")
except ValueError as e:
    print(f"Parse error: {e}")
try:
    MemorySize(-100)  # Negative not allowed
except ValueError as e:
    print(f"Error: {e}")
# Use in configuration
def configure_buffer(buffer_size: str):
    size = MemorySize.parse(buffer_size)
    if size < MemorySize.of_mebi_bytes(10):
        print("Warning: Buffer size too small")
    buffer_bytes = size.get_bytes()
    # Use buffer_bytes...
configure_buffer("64mb")