Implementation:Apache Paimon Schema With Lance Format

From Leeroopedia


Knowledge Sources
Domains: Data_Lake, Columnar_Storage
Last Updated: 2026-02-07 00:00 GMT

Overview

A concrete tool for creating Paimon tables that store data in the Lance columnar file format, configured through Schema options.

Description

To create a Lance-format table, set the file format entry in the options dictionary passed to the Schema. CoreOptions.FILE_FORMAT.key() returns the key 'file.format' and CoreOptions.FILE_FORMAT_LANCE holds the value 'lance', so the required entry is {CoreOptions.FILE_FORMAT.key(): CoreOptions.FILE_FORMAT_LANCE}. Schema.from_pyarrow_schema() accepts these options and produces a Lance-configured schema for table creation.
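As a minimal sketch of the options dictionary described above, the snippet below builds it from plain strings so that it runs without pypaimon installed. The literals 'file.format' and 'lance' are the values the text attributes to CoreOptions.FILE_FORMAT.key() and CoreOptions.FILE_FORMAT_LANCE. The helper name with_lance_format is hypothetical and not part of pypaimon, and 'bucket' stands in for any other table option a caller might already have.

```python
from typing import Dict, Optional

# Values the text attributes to CoreOptions.FILE_FORMAT.key() and
# CoreOptions.FILE_FORMAT_LANCE, hard-coded so the sketch needs no pypaimon.
FILE_FORMAT_KEY = "file.format"
FILE_FORMAT_LANCE = "lance"


def with_lance_format(options: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of `options` with the file format set to Lance.

    Hypothetical helper for illustration only; not part of pypaimon.
    """
    merged = dict(options or {})  # copy, so the caller's dict is not mutated
    merged[FILE_FORMAT_KEY] = FILE_FORMAT_LANCE
    return merged


# 'bucket' is just a sample extra option carried alongside the format entry.
lance_options = with_lance_format({"bucket": "4"})
print(lance_options)
```

With pypaimon available, a dictionary like this would be passed as the options argument of Schema.from_pyarrow_schema(), ideally built from the CoreOptions constants rather than string literals.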

The schema creation process validates the provided PyArrow schema, converts field types to Paimon's internal type system, and stores the Lance format configuration in the table metadata. Once the schema is created with Lance format, all subsequent read and write operations on the table will use the Lance file reader and writer.
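The three steps in the paragraph above (validate, convert types, store the format option) can be pictured with a toy model. This is a sketch under stated assumptions, not pypaimon's implementation: ToySchema, build_toy_schema, and the three-entry type mapping are all hypothetical, columns are given as (name, Arrow type name) pairs instead of a real pa.Schema, and duplicate-name checking is only an assumed example of what validation might cover.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Illustrative subset of an Arrow-to-Paimon type mapping. The real converter
# in pypaimon covers many more types and may differ in detail.
ARROW_TO_PAIMON = {"int64": "BIGINT", "string": "STRING", "float64": "DOUBLE"}


@dataclass
class ToySchema:
    """Hypothetical stand-in for a Paimon Schema: converted fields plus options."""
    fields: List[Tuple[str, str]]
    options: Dict[str, str] = field(default_factory=dict)


def build_toy_schema(
    columns: List[Tuple[str, str]],
    options: Optional[Dict[str, str]] = None,
) -> ToySchema:
    # Step 1: validate the incoming schema (assumed check: unique column names).
    names = [name for name, _ in columns]
    if len(set(names)) != len(names):
        raise ValueError("duplicate column names")

    # Step 2: convert each Arrow type name to a Paimon type name.
    converted = []
    for name, arrow_type in columns:
        if arrow_type not in ARROW_TO_PAIMON:
            raise ValueError(f"unsupported type: {arrow_type}")
        converted.append((name, ARROW_TO_PAIMON[arrow_type]))

    # Step 3: keep the options (including the file format) with the schema.
    return ToySchema(converted, dict(options or {}))


toy = build_toy_schema(
    [("id", "int64"), ("name", "string"), ("value", "float64")],
    {"file.format": "lance"},
)
print(toy.fields)
print(toy.options)
```

The point of the model is that the format choice travels with the schema as an ordinary option, which is why later reads and writes can pick the Lance reader and writer from table metadata alone.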

Usage

Use this implementation when initializing a new Paimon table that should store its data files in Lance format. This is the entry point for all Lance-based analytical workflows in Paimon.

Code Reference

Source Location

  • Repository: Apache Paimon
  • File: paimon-python/pypaimon/schema/schema.py:L51-88
  • File: paimon-python/pypaimon/common/options/core_options.py:L56 (FILE_FORMAT_LANCE constant)
  • File: paimon-python/pypaimon/common/options/core_options.py:L118-123 (FILE_FORMAT option)

Signature

from pypaimon.common.options.core_options import CoreOptions

# Key constants:
# CoreOptions.FILE_FORMAT.key() -> 'file.format'
# CoreOptions.FILE_FORMAT_LANCE -> 'lance'

Schema.from_pyarrow_schema(
    pa_schema: pa.Schema,
    partition_keys: Optional[List[str]] = None,
    primary_keys: Optional[List[str]] = None,
    options: Optional[Dict] = None,  # Include {CoreOptions.FILE_FORMAT.key(): CoreOptions.FILE_FORMAT_LANCE}
    comment: Optional[str] = None,
) -> Schema

Import

from pypaimon.schema.schema import Schema
from pypaimon.common.options.core_options import CoreOptions

I/O Contract

Inputs

Name | Type | Required | Description
pa_schema | pa.Schema | Yes | PyArrow schema defining the table columns and their types
partition_keys | Optional[List[str]] | No | List of column names to use as partition keys
primary_keys | Optional[List[str]] | No | List of column names to use as primary keys
options | Optional[Dict] | No (required for Lance) | Defaults to None; must include {'file.format': 'lance'} to enable Lance format
comment | Optional[str] | No | Optional comment describing the table

Outputs

Name | Type | Description
schema | Schema | A Schema instance with Lance format configured, ready for table creation

Usage Examples

Basic Usage

import pyarrow as pa
from pypaimon.schema.schema import Schema
from pypaimon.common.options.core_options import CoreOptions

pa_schema = pa.schema([
    ('id', pa.int64()),
    ('name', pa.string()),
    ('value', pa.float64()),
    ('category', pa.string()),
])

schema = Schema.from_pyarrow_schema(
    pa_schema,
    options={CoreOptions.FILE_FORMAT.key(): CoreOptions.FILE_FORMAT_LANCE}
)

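# `catalog` is assumed to be an existing pypaimon catalog instance, and the
# `analytics` database is assumed to exist already.
catalog.create_table('analytics.lance_table', schema, ignore_if_exists=True)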
