Implementation:Apache Paimon LookupStrategy
| Knowledge Sources | |
|---|---|
| Domains | Query Optimization, Data Access |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
LookupStrategy encapsulates the decision logic for determining when key-based lookups are required during data processing.
Description
LookupStrategy is a configuration class that determines whether and why lookup operations are needed when processing data in Paimon. It consolidates four different conditions that may trigger the need for key-based lookups: first-row semantics, changelog production, deletion vector handling, and forced lookup mode.
The class follows a static factory pattern: the constructor is private, and the static `from` method takes one boolean flag per condition and computes the derived `needLookup` field. `needLookup` is true if any of the four conditions is true, because each scenario requires access to existing data to make a correct processing decision. Note that `forceLookup` is not stored as a field of its own; it only contributes to `needLookup`.
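The derivation described above can be sketched as a small standalone class. This is a minimal illustration, not the Paimon source: `LookupStrategySketch` is a hypothetical stand-in, and the constructor body is an assumption inferred from the description (a logical OR over the four inputs).

```java
// Illustrative stand-in for LookupStrategy; NOT the Paimon source.
final class LookupStrategySketch {
    final boolean needLookup;
    final boolean isFirstRow;
    final boolean produceChangelog;
    final boolean deletionVector;

    private LookupStrategySketch(
            boolean isFirstRow,
            boolean produceChangelog,
            boolean deletionVector,
            boolean forceLookup) {
        this.isFirstRow = isFirstRow;
        this.produceChangelog = produceChangelog;
        this.deletionVector = deletionVector;
        // forceLookup is not kept as a field; it only feeds the derived flag.
        this.needLookup = isFirstRow || produceChangelog || deletionVector || forceLookup;
    }

    // Static factory: the only way to obtain an instance.
    static LookupStrategySketch from(
            boolean isFirstRow,
            boolean produceChangelog,
            boolean deletionVector,
            boolean forceLookup) {
        return new LookupStrategySketch(isFirstRow, produceChangelog, deletionVector, forceLookup);
    }

    public static void main(String[] args) {
        // Walk all 16 flag combinations and check the OR rule.
        for (int bits = 0; bits < 16; bits++) {
            LookupStrategySketch s = from(
                    (bits & 1) != 0, (bits & 2) != 0, (bits & 4) != 0, (bits & 8) != 0);
            if (s.needLookup != (bits != 0)) {
                throw new AssertionError("unexpected needLookup for combination " + bits);
            }
        }
        System.out.println("needLookup is false only when all four flags are false");
    }
}
```

Under this assumption, the only combination that skips lookups is the one where every input flag is false.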
First-row semantics require lookups to identify and preserve only the first occurrence of each key. Changelog production needs lookups to determine the previous state of a key so that change records can be generated. Deletion vectors require lookups to locate the existing record for a key so that it can be marked as deleted. Force lookup mode requests lookup operations regardless of the other conditions, which is useful for testing or specific data quality scenarios.
The flags are exposed as public final fields, so callers read them directly rather than through accessor methods, which keeps checks in performance-critical data processing paths simple. Because every field is final and assigned once in the constructor, instances are immutable and can be shared safely across processing tasks.
Usage
Use LookupStrategy when configuring data readers or merge engines that need to decide whether key-based lookups are required. Construct the strategy from table properties, query requirements, and processing mode. The `needLookup` field answers whether any lookup is required at all, while the individual flags indicate which lookup behavior applies.
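As a rough sketch of how the four inputs might be derived from table settings: every type below (`MergeEngineKind`, `ChangelogMode`, `TableSettings`, `LookupFlags`) is a hypothetical stand-in introduced for illustration, and the mapping itself is an assumption (first-row from the merge engine, changelog production from a lookup-based changelog mode), not a statement of how Paimon wires its options.

```java
// Hypothetical stand-ins for table settings; names are assumptions for this sketch.
enum MergeEngineKind { DEDUPLICATE, FIRST_ROW }

enum ChangelogMode { NONE, INPUT, LOOKUP }

final class TableSettings {
    final MergeEngineKind mergeEngine;
    final ChangelogMode changelogMode;
    final boolean deletionVectorsEnabled;
    final boolean forceLookup;

    TableSettings(
            MergeEngineKind mergeEngine,
            ChangelogMode changelogMode,
            boolean deletionVectorsEnabled,
            boolean forceLookup) {
        this.mergeEngine = mergeEngine;
        this.changelogMode = changelogMode;
        this.deletionVectorsEnabled = deletionVectorsEnabled;
        this.forceLookup = forceLookup;
    }
}

// Holds the four arguments that would be handed to LookupStrategy.from(...).
final class LookupFlags {
    final boolean isFirstRow;
    final boolean produceChangelog;
    final boolean deletionVector;
    final boolean forceLookup;

    private LookupFlags(boolean isFirstRow, boolean produceChangelog,
                        boolean deletionVector, boolean forceLookup) {
        this.isFirstRow = isFirstRow;
        this.produceChangelog = produceChangelog;
        this.deletionVector = deletionVector;
        this.forceLookup = forceLookup;
    }

    // Assumed mapping from settings to flags, for illustration only.
    static LookupFlags fromSettings(TableSettings s) {
        return new LookupFlags(
                s.mergeEngine == MergeEngineKind.FIRST_ROW,
                s.changelogMode == ChangelogMode.LOOKUP,
                s.deletionVectorsEnabled,
                s.forceLookup);
    }

    public static void main(String[] args) {
        LookupFlags f = fromSettings(
                new TableSettings(MergeEngineKind.FIRST_ROW, ChangelogMode.NONE, false, false));
        System.out.println("isFirstRow=" + f.isFirstRow
                + " produceChangelog=" + f.produceChangelog);
    }
}
```

In real code the four values would be passed to `LookupStrategy.from(...)` in the same order; keeping the mapping in one place avoids scattering property checks across readers.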
Code Reference
Source Location
- Repository: Apache_Paimon
- File: paimon-api/src/main/java/org/apache/paimon/lookup/LookupStrategy.java
Signature
public class LookupStrategy {
    public final boolean needLookup;
    public final boolean isFirstRow;
    public final boolean produceChangelog;
    public final boolean deletionVector;

    private LookupStrategy(
            boolean isFirstRow,
            boolean produceChangelog,
            boolean deletionVector,
            boolean forceLookup)

    public static LookupStrategy from(
            boolean isFirstRow,
            boolean produceChangelog,
            boolean deletionVector,
            boolean forceLookup)
}
Import
import org.apache.paimon.lookup.LookupStrategy;
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| isFirstRow | boolean | Yes | Whether first-row semantics are required |
| produceChangelog | boolean | Yes | Whether changelog generation is enabled |
| deletionVector | boolean | Yes | Whether deletion vectors are in use |
| forceLookup | boolean | Yes | Whether to force lookup regardless of other conditions |
Outputs
| Name | Type | Description |
|---|---|---|
| lookupStrategy | LookupStrategy | Configured strategy instance |
| needLookup | boolean | Whether any form of lookup is required |
| isFirstRow | boolean | First-row semantics flag |
| produceChangelog | boolean | Changelog production flag |
| deletionVector | boolean | Deletion vector flag |
Usage Examples
// No lookup needed - append-only table
LookupStrategy appendOnly = LookupStrategy.from(
    false, // not first-row
    false, // no changelog
    false, // no deletion vectors
    false  // not forced
);
System.out.println(appendOnly.needLookup); // false

// Lookup needed for first-row semantics
LookupStrategy firstRow = LookupStrategy.from(
    true,  // first-row enabled
    false,
    false,
    false
);
System.out.println(firstRow.needLookup); // true
System.out.println(firstRow.isFirstRow); // true

// Lookup needed for changelog production
LookupStrategy changelog = LookupStrategy.from(
    false,
    true,  // produce changelog
    false,
    false
);
System.out.println(changelog.needLookup);       // true
System.out.println(changelog.produceChangelog); // true

// Lookup needed for deletion vectors
LookupStrategy deletionVec = LookupStrategy.from(
    false,
    false,
    true,  // deletion vectors enabled
    false
);
System.out.println(deletionVec.needLookup);     // true
System.out.println(deletionVec.deletionVector); // true

// Forced lookup for testing
LookupStrategy forced = LookupStrategy.from(
    false,
    false,
    false,
    true   // force lookup
);
System.out.println(forced.needLookup); // true

// Complex scenario - multiple conditions
LookupStrategy complex = LookupStrategy.from(
    true,  // first-row
    true,  // changelog
    false,
    false
);
System.out.println(complex.needLookup); // true
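
// Using in reader configuration.
// `strategy` is a previously built LookupStrategy; the reader classes, `files`,
// and `lookupTable` are illustrative placeholders.
RecordReader reader;
if (!strategy.needLookup) {
    reader = new SimpleReader(files);
} else if (strategy.deletionVector) {
    reader = new DeletionVectorReader(files, lookupTable);
} else if (strategy.produceChangelog) {
    reader = new ChangelogReader(files, lookupTable);
} else if (strategy.isFirstRow) {
    reader = new FirstRowReader(files, lookupTable);
} else {
    // Only forceLookup was set: no specific flag is true, so fall back to a
    // generic lookup-backed reader (placeholder) instead of leaving `reader` unassigned.
    reader = new LookupReader(files, lookupTable);
}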

// Optimization based on strategy
if (!strategy.needLookup) {
    // Can skip building lookup structures
    return readWithoutLookup(files);
} else {
    // Need to build and maintain lookup index
    LookupTable lookup = buildLookupTable();
    return readWithLookup(files, lookup, strategy);
}