Implementation:Apache Paimon Ray Init
| Knowledge Sources | |
|---|---|
| Domains | Data_Lake, Distributed_Computing |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
External tool for initializing the Ray distributed computing framework for Paimon integration.
Description
ray.init() starts or connects to a Ray cluster runtime. When used with Paimon, it enables distributed table reads via to_ray() and distributed writes via write_ray(). Supports both local mode (single machine with multiple CPUs) and cluster mode (multi-node deployment).
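The choice between the two modes comes down to which arguments are passed to ray.init(). The following is a minimal sketch, not part of Paimon or Ray: `ray_init_kwargs` is a hypothetical helper name introduced here for illustration. It assumes that resource arguments such as num_cpus apply only when a new local instance is started, so it leaves them out whenever an address is given.

```python
from typing import Any, Dict, Optional


def ray_init_kwargs(
    address: Optional[str] = None,
    num_cpus: Optional[int] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for ray.init() (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"ignore_reinit_error": True}
    if address is not None:
        # Cluster mode: the running cluster already defines its resources,
        # so num_cpus is deliberately not forwarded.
        kwargs["address"] = address
    elif num_cpus is not None:
        # Local mode: cap the CPUs of the instance started on this machine.
        kwargs["num_cpus"] = num_cpus
    return kwargs


# Usage (requires Ray to be installed):
#   import ray
#   ray.init(**ray_init_kwargs(num_cpus=4))       # local mode
#   ray.init(**ray_init_kwargs(address="auto"))   # join a running cluster
```

Keeping the argument selection in a plain function makes the local/cluster decision easy to unit-test without starting a Ray runtime.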
Usage
Call ray.init() once at the beginning of any script that uses Paimon distributed operations. By default, a second call in the same process raises a RuntimeError; pass ignore_reinit_error=True to make initialization idempotent in notebooks or scripts that may be re-executed.
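An alternative to ignore_reinit_error=True is an explicit guard based on ray.is_initialized(). The sketch below is one possible way to write it; `ensure_ray` is a hypothetical helper name, not a Ray or Paimon function.

```python
def ensure_ray(**init_kwargs) -> bool:
    """Start Ray only if this process has no active Ray runtime.

    Returns True if this call initialized Ray, False if it was already running.
    (Hypothetical helper for illustration.)
    """
    import ray  # imported lazily so the helper can be defined without Ray installed

    if ray.is_initialized():
        # An earlier call already started or joined a runtime; do nothing.
        return False
    ray.init(**init_kwargs)
    return True
```

Unlike ignore_reinit_error=True, the guard reports whether the current call actually performed the initialization, which can be useful when a notebook cell is re-run with different arguments.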
Code Reference
Source Location
External Tool - ray.init() documentation
Signature
ray.init(
    address: Optional[str] = None,
    *,
    num_cpus: Optional[int] = None,
    num_gpus: Optional[int] = None,
    ignore_reinit_error: bool = False,
    **kwargs
) -> BaseContext  # RayContext, or ClientContext for ray:// addresses
Import
import ray
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
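| address | Optional[str] | No | Address of an existing Ray cluster to connect to (for example "auto" or a ray:// URL); if None, Ray starts a local instance unless an existing cluster is discovered, for example through the RAY_ADDRESS environment variable |
| num_cpus | Optional[int] | No | Number of CPUs to assign to a newly started local instance; must not be set when connecting to an existing cluster |
| num_gpus | Optional[int] | No | Number of GPUs to assign to a newly started local instance; must not be set when connecting to an existing cluster |
| ignore_reinit_error | bool | No | If True, suppress the error raised when ray.init() is called again in the same process (default False) |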
Outputs
| Name | Type | Description |
|---|---|---|
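| (return) | BaseContext | Context object describing the connected runtime (RayContext, or ClientContext for ray:// addresses); the Ray runtime itself is initialized as a side effect |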
Usage Examples
Basic Usage
import ray
# Initialize local Ray cluster with 4 CPUs
ray.init(ignore_reinit_error=True, num_cpus=4)
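For scripts that should release their Ray resources when the distributed work is done, initialization can be paired with ray.shutdown(). The following is a minimal sketch; `ray_session` is a hypothetical name introduced here and is not provided by Ray or Paimon.

```python
from contextlib import contextmanager


@contextmanager
def ray_session(**init_kwargs):
    """Initialize Ray on entry and shut it down on exit (hypothetical helper)."""
    import ray  # imported lazily so the helper can be defined without Ray installed

    ray.init(**init_kwargs)
    try:
        yield
    finally:
        # Runs even if the body raises, so the runtime is always released.
        ray.shutdown()


# Usage (requires Ray to be installed):
#   with ray_session(num_cpus=4):
#       ...  # Paimon distributed reads or writes go here
```

Note that if ignore_reinit_error=True is passed and a runtime was already active, the shutdown on exit also disconnects that earlier runtime, so this pattern fits standalone scripts better than shared notebook sessions.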