Implementation:Apache Paimon LookupStrategy


Knowledge Sources
Domains Query Optimization, Data Access
Last Updated 2026-02-08 00:00 GMT

Overview

LookupStrategy encapsulates the decision logic for determining when key-based lookups are required during data processing.

Description

LookupStrategy is a configuration class that determines whether and why lookup operations are needed when processing data in Paimon. It consolidates four different conditions that may trigger the need for key-based lookups: first-row semantics, changelog production, deletion vector handling, and forced lookup mode.

The class exposes a static factory method, `from`, that takes one boolean flag per condition and computes the derived `needLookup` field. A lookup is considered necessary if any of the four conditions is true, because each scenario requires access to existing data to make correct processing decisions. Note that `forceLookup` is not kept as a field of its own: as the signature below shows, it only contributes to `needLookup`.
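The derivation described above can be sketched as follows. This is a minimal stand-in written for illustration, not the Paimon source: the class name `LookupStrategySketch` is hypothetical, and the constructor body is an assumption inferred from the statement that a lookup is needed when any of the four conditions is true.

```java
// Illustrative stand-in; the real class is org.apache.paimon.lookup.LookupStrategy.
final class LookupStrategySketch {
    final boolean needLookup;
    final boolean isFirstRow;
    final boolean produceChangelog;
    final boolean deletionVector;

    private LookupStrategySketch(
            boolean isFirstRow,
            boolean produceChangelog,
            boolean deletionVector,
            boolean forceLookup) {
        this.isFirstRow = isFirstRow;
        this.produceChangelog = produceChangelog;
        this.deletionVector = deletionVector;
        // forceLookup is not stored; it only feeds the derived flag.
        this.needLookup = isFirstRow || produceChangelog || deletionVector || forceLookup;
    }

    // Static factory: the only way to obtain an instance.
    static LookupStrategySketch from(
            boolean isFirstRow,
            boolean produceChangelog,
            boolean deletionVector,
            boolean forceLookup) {
        return new LookupStrategySketch(isFirstRow, produceChangelog, deletionVector, forceLookup);
    }
}
```

With this sketch, a forced-only strategy reports `needLookup` as true while all three individual flags stay false, which is why callers cannot infer the reason for a lookup from `needLookup` alone.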

First-row semantics require lookups to identify and preserve only the first occurrence of each key. Changelog production needs lookups to determine the previous state for generating change records. Deletion vectors require lookups to check whether records have been marked for deletion. Force lookup mode explicitly requests lookup operations regardless of other conditions, useful for testing or specific data quality scenarios.

The fields are public and final, so callers read them directly without accessor methods, which keeps checks in hot data processing paths simple. Because no field can change after construction, the instance is immutable and can be shared safely across processing tasks and threads.
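To illustrate the thread-safety point, the sketch below shares one immutable instance across several tasks with no synchronization. The `Strategy` class is a reduced, hypothetical stand-in for LookupStrategy that keeps only the derived flag, and `SharedStrategyDemo` and `countTasksSeeingLookup` are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SharedStrategyDemo {
    // Reduced stand-in: a single final field set once in the constructor.
    static final class Strategy {
        final boolean needLookup;

        Strategy(boolean needLookup) {
            this.needLookup = needLookup;
        }
    }

    // Submits `tasks` readers of the same instance and counts how many saw needLookup == true.
    static int countTasksSeeingLookup(Strategy shared, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                // No locking needed: the field is final and never reassigned.
                results.add(pool.submit(() -> shared.needLookup));
            }
            int count = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    count++;
                }
            }
            return count;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Every task observes the same value because the state is fixed at construction time.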

Usage

Use LookupStrategy when configuring data readers or merge engines that need to decide whether key-based lookups are required. The strategy should be constructed based on table properties, query requirements, and processing mode. The `needLookup` field provides a fast check, while individual flags help optimize the specific lookup behavior.
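As a rough sketch of constructing the strategy from table properties, the example below maps a plain options map onto the four flags. Everything here is an assumption for illustration: the option keys (`merge-engine`, `changelog-producer`, `deletion-vectors.enabled`, `force-lookup`) are modeled on Paimon table options and should be checked against the version in use, and `StrategyFromOptions` and its nested `Strategy` class are hypothetical stand-ins rather than Paimon APIs.

```java
import java.util.Map;

final class StrategyFromOptions {
    // Hypothetical stand-in with the same four fields as LookupStrategy.
    static final class Strategy {
        final boolean needLookup;
        final boolean isFirstRow;
        final boolean produceChangelog;
        final boolean deletionVector;

        Strategy(boolean isFirstRow, boolean produceChangelog,
                 boolean deletionVector, boolean forceLookup) {
            this.isFirstRow = isFirstRow;
            this.produceChangelog = produceChangelog;
            this.deletionVector = deletionVector;
            this.needLookup = isFirstRow || produceChangelog || deletionVector || forceLookup;
        }
    }

    // Option keys are assumptions; missing keys are treated as "disabled".
    static Strategy fromOptions(Map<String, String> options) {
        boolean firstRow = "first-row".equals(options.get("merge-engine"));
        boolean changelog = "lookup".equals(options.get("changelog-producer"));
        boolean deletionVector =
                Boolean.parseBoolean(options.getOrDefault("deletion-vectors.enabled", "false"));
        boolean force = Boolean.parseBoolean(options.getOrDefault("force-lookup", "false"));
        return new Strategy(firstRow, changelog, deletionVector, force);
    }
}
```

Deriving all four flags in one place keeps the decision next to the configuration, so downstream code only has to consult the resulting strategy.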

Code Reference

Source Location

Signature

public class LookupStrategy {
    public final boolean needLookup;
    public final boolean isFirstRow;
    public final boolean produceChangelog;
    public final boolean deletionVector;

    private LookupStrategy(
        boolean isFirstRow,
        boolean produceChangelog,
        boolean deletionVector,
        boolean forceLookup)

    public static LookupStrategy from(
        boolean isFirstRow,
        boolean produceChangelog,
        boolean deletionVector,
        boolean forceLookup)
}

Import

import org.apache.paimon.lookup.LookupStrategy;

I/O Contract

Inputs

Name              Type     Required  Description
isFirstRow        boolean  Yes       Whether first-row semantics are required
produceChangelog  boolean  Yes       Whether changelog generation is enabled
deletionVector    boolean  Yes       Whether deletion vectors are in use
forceLookup       boolean  Yes       Whether to force lookup regardless of other conditions

Outputs

Name              Type            Description
lookupStrategy    LookupStrategy  Configured strategy instance
needLookup        boolean         Whether any form of lookup is required
isFirstRow        boolean         First-row semantics flag
produceChangelog  boolean         Changelog production flag
deletionVector    boolean         Deletion vector flag

Usage Examples

// No lookup needed - append-only table
LookupStrategy appendOnly = LookupStrategy.from(
    false, // not first-row
    false, // no changelog
    false, // no deletion vectors
    false  // not forced
);
System.out.println(appendOnly.needLookup); // false

// Lookup needed for first-row semantics
LookupStrategy firstRow = LookupStrategy.from(
    true,  // first-row enabled
    false,
    false,
    false
);
System.out.println(firstRow.needLookup); // true
System.out.println(firstRow.isFirstRow); // true

// Lookup needed for changelog production
LookupStrategy changelog = LookupStrategy.from(
    false,
    true,  // produce changelog
    false,
    false
);
System.out.println(changelog.needLookup); // true
System.out.println(changelog.produceChangelog); // true

// Lookup needed for deletion vectors
LookupStrategy deletionVec = LookupStrategy.from(
    false,
    false,
    true,  // deletion vectors enabled
    false
);
System.out.println(deletionVec.needLookup); // true
System.out.println(deletionVec.deletionVector); // true

// Forced lookup for testing
LookupStrategy forced = LookupStrategy.from(
    false,
    false,
    false,
    true   // force lookup
);
System.out.println(forced.needLookup); // true

// Complex scenario - multiple conditions
LookupStrategy complex = LookupStrategy.from(
    true,  // first-row
    true,  // changelog
    false,
    false
);
System.out.println(complex.needLookup); // true

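// Using in reader configuration.
// The reader classes, files, and lookupTable are illustrative placeholders.
RecordReader reader;
if (strategy.needLookup) {
    if (strategy.deletionVector) {
        reader = new DeletionVectorReader(files, lookupTable);
    } else if (strategy.produceChangelog) {
        reader = new ChangelogReader(files, lookupTable);
    } else if (strategy.isFirstRow) {
        reader = new FirstRowReader(files, lookupTable);
    } else {
        // Only forceLookup was set: needLookup is true but no individual flag is.
        // Without this branch, reader would be left unassigned.
        reader = new ForcedLookupReader(files, lookupTable); // hypothetical class
    }
} else {
    reader = new SimpleReader(files);
}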

// Optimization based on strategy
if (!strategy.needLookup) {
    // Can skip building lookup structures
    return readWithoutLookup(files);
} else {
    // Need to build and maintain lookup index
    LookupTable lookup = buildLookupTable();
    return readWithLookup(files, lookup, strategy);
}
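The reader-selection fragment above depends on classes that are not defined on this page. A self-contained variant, offered only as a sketch, replaces the readers with an enum so the branching can be run and checked. `ReaderSelection`, `ReaderKind`, and the nested `Strategy` stand-in are hypothetical names, and the precedence order (deletion vectors, then changelog, then first-row) simply mirrors the fragment rather than any documented Paimon behavior.

```java
final class ReaderSelection {
    enum ReaderKind { SIMPLE, DELETION_VECTOR, CHANGELOG, FIRST_ROW, FORCED_LOOKUP }

    // Hypothetical stand-in with the same four fields as LookupStrategy.
    static final class Strategy {
        final boolean needLookup;
        final boolean isFirstRow;
        final boolean produceChangelog;
        final boolean deletionVector;

        Strategy(boolean isFirstRow, boolean produceChangelog,
                 boolean deletionVector, boolean forceLookup) {
            this.isFirstRow = isFirstRow;
            this.produceChangelog = produceChangelog;
            this.deletionVector = deletionVector;
            this.needLookup = isFirstRow || produceChangelog || deletionVector || forceLookup;
        }
    }

    static ReaderKind select(Strategy s) {
        if (!s.needLookup) {
            return ReaderKind.SIMPLE; // fast path: no lookup structures required
        }
        if (s.deletionVector) {
            return ReaderKind.DELETION_VECTOR;
        }
        if (s.produceChangelog) {
            return ReaderKind.CHANGELOG;
        }
        if (s.isFirstRow) {
            return ReaderKind.FIRST_ROW;
        }
        // needLookup is true but no individual flag is set: lookup was forced.
        return ReaderKind.FORCED_LOOKUP;
    }
}
```

Returning a value from every branch makes the forced-only case explicit instead of leaving it to fall through.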
