Implementation:Apache Paimon CatalogFactory Create for Ray

From Leeroopedia


Knowledge Sources
Domains: Data_Lake, Distributed_Computing
Last Updated: 2026-02-07 00:00 GMT

Overview

Concrete tool for creating Paimon catalog instances in preparation for Ray distributed operations.

Description

Uses CatalogFactory.create() to establish a catalog connection, then catalog.get_table() to obtain a table reference. In the Ray context, this setup runs on the driver node before read/write tasks are distributed to workers. The CatalogFactory.create() call is the same as in non-Ray usage; only the resulting table is used differently, for the Ray-specific operations to_ray() and write_ray().

Usage

Call CatalogFactory.create() with the appropriate catalog options dictionary, then call catalog.get_table() with the fully qualified table identifier (e.g., db.table). The resulting table reference is then used to create read builders or write builders for Ray operations.
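The two calls described above can be wrapped in a small driver-side helper. The following is a minimal sketch, not part of pypaimon: split_identifier and open_table are hypothetical names, and the check assumes that a fully qualified identifier has exactly one dot, as in db.table. The pypaimon import is deferred so that the validation helper can be used on its own.

```python
from typing import Any, Dict, Tuple


def split_identifier(identifier: str) -> Tuple[str, str]:
    """Split 'db.table' into (database, table); reject anything else.

    Assumption: a fully qualified identifier contains exactly one dot.
    """
    parts = identifier.split(".")
    if len(parts) != 2 or not all(parts):
        raise ValueError(
            f"expected a fully qualified 'db.table' identifier, got {identifier!r}"
        )
    return parts[0], parts[1]


def open_table(catalog_options: Dict[str, str], identifier: str) -> Any:
    """Create the catalog and return a table reference (hypothetical helper)."""
    # Fail fast on a malformed name before contacting the metastore.
    split_identifier(identifier)
    # Imported lazily; the import path is the one given on this page.
    from pypaimon.catalog.catalog_factory import CatalogFactory

    catalog = CatalogFactory.create(catalog_options)
    return catalog.get_table(identifier)
```

Calling open_table(catalog_options, 'my_db.my_table') on the driver would then return the table reference that later Ray read or write steps start from.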

Code Reference

Source Location

paimon-python/pypaimon/catalog/catalog_factory.py:L28-44

Signature

class CatalogFactory:
    @staticmethod
    def create(catalog_options: Dict) -> Catalog:
        ...

class Catalog(ABC):
    @abstractmethod
    def get_table(self, identifier: Union[str, Identifier]) -> 'Table':
        ...

Import

from pypaimon.catalog.catalog_factory import CatalogFactory

I/O Contract

Inputs

Name | Type | Required | Description
catalog_options | Dict | Yes | Configuration with 'metastore', 'uri', 'warehouse', etc.
identifier | Union[str, Identifier] | Yes | Table name (e.g., 'db.table')
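Because catalog_options is a plain dictionary, a missing key may only surface once the catalog is created. A pre-flight check along the following lines could catch that earlier. This is a sketch under stated assumptions: missing_options and ASSUMED_REQUIRED_KEYS are hypothetical, and both the per-metastore key lists and the 'filesystem' default are illustrative guesses rather than values taken from the pypaimon source.

```python
from typing import Dict, List

# Hypothetical mapping: which keys each metastore type needs is an assumption
# made for illustration, not something read from pypaimon.
ASSUMED_REQUIRED_KEYS = {
    "rest": ["uri"],
    "filesystem": ["warehouse"],
}


def missing_options(catalog_options: Dict[str, str]) -> List[str]:
    """Return the assumed-required keys that are absent or empty."""
    # Assumption: an unspecified metastore is treated as 'filesystem'.
    metastore = catalog_options.get("metastore", "filesystem")
    # Unknown metastore types have no assumed requirements.
    required = ASSUMED_REQUIRED_KEYS.get(metastore, [])
    return [key for key in required if not catalog_options.get(key)]
```

An empty result means that no assumed-required key is missing; it does not guarantee that CatalogFactory.create() will accept the options.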

Outputs

Name | Type | Description
catalog | Catalog | Catalog instance connected to the configured metastore
table | FileStoreTable | Table reference for read/write operations

Usage Examples

Basic Usage

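from pypaimon.catalog.catalog_factory import CatalogFactory

# Options for a REST metastore; replace the URI and token with real values.
catalog_options = {
    'metastore': 'rest',
    'uri': 'http://localhost:8080',
    'token': 'my-token',
}
# Create the catalog on the driver, then look up the table by its 'db.table' name.
catalog = CatalogFactory.create(catalog_options)
table = catalog.get_table('my_db.my_table')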
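Once the table reference exists, the Ray read path mentioned in the Description could look roughly like the sketch below. It assumes the read-builder call chain from the pypaimon Python API (new_read_builder(), new_scan().plan().splits(), new_read()) and that to_ray() accepts the planned splits; read_table_to_ray is a hypothetical wrapper name. The function only relies on duck typing, so it does not import pypaimon or Ray itself.

```python
from typing import Any


def read_table_to_ray(table: Any) -> Any:
    """Plan splits on the driver and pass them to to_ray() (hypothetical helper).

    Assumption: `table` exposes the pypaimon read-builder API used below.
    """
    read_builder = table.new_read_builder()
    # Planning happens on the driver; the splits describe the work to distribute.
    splits = read_builder.new_scan().plan().splits()
    table_read = read_builder.new_read()
    # to_ray() is the Ray-specific operation named in the Description.
    return table_read.to_ray(splits)
```

With the table from the example above, read_table_to_ray(table) would return whatever to_ray() produces for the planned splits.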
