Implementation:Apache Paimon PredicateBuilder Filtering
| Knowledge Sources | |
|---|---|
| Domains | Data_Lake, Distributed_Computing |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
Concrete tools for building predicate filters and column projections on Paimon table reads.
Description
PredicateBuilder provides an API for constructing filter predicates. It supports 15 comparison methods (equal, not_equal, less_than, greater_than, between, is_in, startswith, contains, etc.) and two logical combinators (and_predicates, or_predicates). ReadBuilder.with_filter() accepts a predicate, and with_projection() accepts a list of column names. When both are set before to_ray() is called, the filter is applied during scan planning, so only relevant data is distributed to Ray workers.
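As a rough sketch of where the filter and projection sit in a Ray read, the function below configures the read builder first and only then plans the scan. The calls `new_scan()`, `plan()`, `splits()`, and `new_read()`, and the use of `to_ray(splits)` as a method on the table read, follow the read flow described in pypaimon's documentation but are not shown on this page, so treat them as assumptions. The function name `read_filtered_to_ray` and the `price` column are hypothetical.

```python
from typing import Any, List


def read_filtered_to_ray(table: Any, min_price: float, columns: List[str]) -> Any:
    """Sketch: set filter and projection before planning, then hand splits to Ray.

    `table` is assumed to be a pypaimon table obtained from a catalog.
    """
    read_builder = table.new_read_builder()
    pb = read_builder.new_predicate_builder()

    # Filter and projection are configured on the builder before the scan is planned.
    read_builder = read_builder.with_filter(pb.greater_than('price', min_price))
    read_builder = read_builder.with_projection(columns)

    # Assumed pypaimon read flow (not shown on this page).
    splits = read_builder.new_scan().plan().splits()
    table_read = read_builder.new_read()
    return table_read.to_ray(splits)
```

Configuring the builder before `new_scan()` is the point of the sketch: the planner can only skip data for predicates it already knows about.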
Usage
Obtain a PredicateBuilder from read_builder.new_predicate_builder(), construct predicates using comparison methods, combine them with logical operators, and apply them to the read builder using with_filter() and with_projection().
Code Reference
Source Location
paimon-python/pypaimon/common/predicate_builder.py:L25-131
paimon-python/pypaimon/read/read_builder.py:L30-86
Signature
class PredicateBuilder:
    def __init__(self, row_field: List[DataField]):
    def equal(self, field: str, literal: Any) -> Predicate:
    def not_equal(self, field: str, literal: Any) -> Predicate:
    def less_than(self, field: str, literal: Any) -> Predicate:
    def greater_than(self, field: str, literal: Any) -> Predicate:
    def less_or_equal(self, field: str, literal: Any) -> Predicate:
    def greater_or_equal(self, field: str, literal: Any) -> Predicate:
    def between(self, field: str, lower: Any, upper: Any) -> Predicate:
    def is_in(self, field: str, literals: List[Any]) -> Predicate:
    def is_null(self, field: str) -> Predicate:
    def is_not_null(self, field: str) -> Predicate:
    def startswith(self, field: str, pattern: Any) -> Predicate:
    def endswith(self, field: str, pattern: Any) -> Predicate:
    def contains(self, field: str, pattern: Any) -> Predicate:
    @staticmethod
    def and_predicates(predicates: List[Predicate]) -> Optional[Predicate]:
    @staticmethod
    def or_predicates(predicates: List[Predicate]) -> Optional[Predicate]:

class ReadBuilder:
    def with_filter(self, predicate: Predicate) -> 'ReadBuilder':
    def with_projection(self, projection: List[str]) -> 'ReadBuilder':
    def new_predicate_builder(self) -> PredicateBuilder:
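Both combinators are declared to return `Optional[Predicate]`, so callers that build the predicate list dynamically may want to guard against `None`. The sketch below assumes `None` means there is nothing to filter on (for example, an empty list); that interpretation is not confirmed by this page, and `apply_all` is a hypothetical helper name.

```python
from typing import Any, List


def apply_all(read_builder: Any, predicates: List[Any]) -> Any:
    """Apply the AND of `predicates`, or return the builder unchanged if there is none."""
    pb = read_builder.new_predicate_builder()
    combined = pb.and_predicates(predicates)

    # Assumption: a None result means "no filter"; skip with_filter() in that case.
    if combined is None:
        return read_builder
    return read_builder.with_filter(combined)
```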
Import
from pypaimon.common.predicate_builder import PredicateBuilder
from pypaimon.read.read_builder import ReadBuilder
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| field | str | Yes | Column name for predicate |
| literal | Any | Yes | Comparison value (type must match column type) |
| lower | Any | Yes (for between) | Lower bound for range predicate |
| upper | Any | Yes (for between) | Upper bound for range predicate |
| literals | List[Any] | Yes (for is_in) | List of values for membership predicate |
| pattern | Any | Yes (for string ops) | Pattern for startswith, endswith, contains |
| predicates | List[Predicate] | Yes (for combinators) | List of predicates to combine with AND/OR |
| projection | List[str] | Yes (for with_projection) | Column names to select |
Outputs
| Name | Type | Description |
|---|---|---|
| predicate | Predicate | Predicate object for chaining or applying to ReadBuilder |
| read_builder | ReadBuilder | Configured ReadBuilder with filter and/or projection applied |
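Because `with_filter()` and `with_projection()` are both declared to return a `ReadBuilder`, the two calls can be chained. The sketch below shows one way to do that with a membership predicate; `configure` is a hypothetical helper name, and whether each call returns the same builder instance or a new one is not stated on this page, so the code relies only on the returned value.

```python
from typing import Any, List


def configure(read_builder: Any, field: str, values: List[Any], columns: List[str]) -> Any:
    """Chain a membership filter and a projection, returning the configured builder."""
    pb = read_builder.new_predicate_builder()
    return (
        read_builder
        .with_filter(pb.is_in(field, values))  # keep rows whose `field` is in `values`
        .with_projection(columns)              # keep only the listed columns
    )
```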
Usage Examples
Basic Usage
# `table` is an existing Paimon table object (for example, loaded from a catalog).
read_builder = table.new_read_builder()
pb = read_builder.new_predicate_builder()

# Build a compound predicate: category == 'electronics' AND price > 100
predicate = pb.and_predicates([
    pb.equal('category', 'electronics'),
    pb.greater_than('price', 100),
])

# Apply the filter and project to three columns
read_builder = read_builder.with_filter(predicate)
read_builder = read_builder.with_projection(['id', 'name', 'price'])
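Combinators can also be nested. The following is a minimal sketch that mixes a range, a membership test, a string prefix, and a null check, using only methods listed in the signature above; the column names, literal values, and the helper name `build_product_filter` are hypothetical.

```python
from typing import Any


def build_product_filter(pb: Any) -> Any:
    """Build: price range AND (category membership OR name prefix) AND price not null."""
    in_range = pb.between('price', 50, 200)

    # Either condition is enough for a row to pass this branch.
    category_or_prefix = pb.or_predicates([
        pb.is_in('category', ['electronics', 'appliances']),
        pb.startswith('name', 'Pro'),
    ])

    # All three top-level conditions must hold.
    return pb.and_predicates([
        in_range,
        category_or_prefix,
        pb.is_not_null('price'),
    ])
```

The returned predicate can then be passed to `read_builder.with_filter()` in the same way as in the basic example.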