Implementation:Apache Beam Twister2PipelineOptions

From Leeroopedia


Attribute | Value
Implementation Name | Twister2PipelineOptions (Pattern Doc)
Domain | Configuration_Management, HPC
Overview | Concrete configuration interface and auto-service registrar for Twister2 pipeline options
Type | Pattern Doc (interface + registration pattern)
Deprecation Notice | The Twister2 Runner is deprecated and scheduled for removal in Apache Beam 3.0
last_updated | 2026-02-09 04:00 GMT

Overview

This page documents a pattern rather than a single API: the Twister2PipelineOptions configuration interface together with its companion service registrar, Twister2RunnerRegistrar. The interface declares the runner-specific options, and the registrar makes both the runner classes and the options interface discoverable at runtime.

Note: The Twister2 Runner is deprecated and is scheduled for removal in Apache Beam 3.0. Users should plan migration to an actively maintained runner.

Code Reference

Source Location

File | Lines | Repository
runners/twister2/src/main/java/org/apache/beam/runners/twister2/Twister2PipelineOptions.java | L29-77 | GitHub
runners/twister2/src/main/java/org/apache/beam/runners/twister2/Twister2RunnerRegistrar.java | L33-53 | GitHub

Signature: Twister2PipelineOptions

public interface Twister2PipelineOptions
    extends PipelineOptions, StreamingOptions, FileStagingOptions {

  @Description("set parallelism for Twister2 processor")
  @Default.Integer(1)
  int getParallelism();
  void setParallelism(int parallelism);

  @Description("Twister2 TSetEnvironment")
  @JsonIgnore
  TSetEnvironment getTSetEnvironment();
  void setTSetEnvironment(TSetEnvironment environment);

  @Description("Twister2 cluster type, supported types: standalone, nomad, kubernetes, mesos")
  @Default.String("standalone")
  String getClusterType();
  void setClusterType(String name);

  @Description("Job file zip")
  String getJobFileZip();
  void setJobFileZip(String pathToZip);

  @Description("Job type, jar or java_zip")
  @Default.String("java_zip")
  String getJobType();
  void setJobType(String jobType);

  @Description("Twister2 home directory")
  String getTwister2Home();
  void setTwister2Home(String twister2Home);

  @Description("CPU's per worker")
  @Default.Integer(2)
  int getWorkerCPUs();
  void setWorkerCPUs(int workerCPUs);

  @Description("RAM allocated per worker")
  @Default.Integer(2048)
  int getRamMegaBytes();
  void setRamMegaBytes(int ramMegaBytes);
}
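
The getter/setter pairs above have no hand-written implementation; an options interface of this shape is typically backed by a dynamic proxy. The following is a minimal sketch of that idea using only the JDK, not Beam's actual implementation. The names OptionsProxySketch, DemoOptions, and DefaultInt are hypothetical stand-ins introduced for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

final class OptionsProxySketch {

  /** Hypothetical stand-in for an integer default annotation such as @Default.Integer. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface DefaultInt {
    int value();
  }

  /** Hypothetical two-property subset of an options interface. */
  interface DemoOptions {
    @DefaultInt(1)
    int getParallelism();
    void setParallelism(int parallelism);

    @DefaultInt(2048)
    int getRamMegaBytes();
    void setRamMegaBytes(int ramMegaBytes);
  }

  /** Builds a proxy: setters store values, getters fall back to the annotated default. */
  static DemoOptions create() {
    Map<String, Object> values = new HashMap<>();
    InvocationHandler handler = (proxy, method, args) -> {
      String name = method.getName();
      if (name.startsWith("set") && args != null && args.length == 1) {
        values.put(name.substring(3), args[0]);
        return null;
      }
      if (name.startsWith("get")) {
        String property = name.substring(3);
        if (values.containsKey(property)) {
          return values.get(property);
        }
        DefaultInt def = method.getAnnotation(DefaultInt.class);
        // Every getter in this sketch has a default, so def is never null here.
        return def == null ? null : (Object) def.value();
      }
      // toString/hashCode/equals are deliberately not handled in this sketch.
      throw new UnsupportedOperationException(name);
    };
    return (DemoOptions) Proxy.newProxyInstance(
        DemoOptions.class.getClassLoader(), new Class<?>[] {DemoOptions.class}, handler);
  }

  public static void main(String[] args) {
    DemoOptions options = create();
    System.out.println(options.getParallelism()); // default from the annotation
    options.setParallelism(8);
    System.out.println(options.getParallelism()); // value stored by the setter
  }
}
```

The design point is that the interface is the whole contract: adding an option means adding a getter, a setter, and optionally a default annotation, with no implementation class to maintain.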

Signature: Twister2RunnerRegistrar

public class Twister2RunnerRegistrar {
  private Twister2RunnerRegistrar() {}

  @AutoService(PipelineRunnerRegistrar.class)
  public static class Runner implements PipelineRunnerRegistrar {
    @Override
    public Iterable<Class<? extends PipelineRunner<?>>> getPipelineRunners() {
      return ImmutableList.of(Twister2Runner.class, Twister2TestRunner.class);
    }
  }

  @AutoService(PipelineOptionsRegistrar.class)
  public static class Options implements PipelineOptionsRegistrar {
    @Override
    public Iterable<Class<? extends PipelineOptions>> getPipelineOptions() {
      return ImmutableList.of(Twister2PipelineOptions.class);
    }
  }
}

Import Statements

import org.apache.beam.runners.twister2.Twister2PipelineOptions;
import org.apache.beam.runners.twister2.Twister2RunnerRegistrar;

I/O Contract

Inputs

Input | Type | Description
User-provided pipeline options | Command-line arguments or programmatic configuration | Twister2-specific parameters such as parallelism, cluster type, Twister2 home, worker CPUs, and RAM
@AutoService metadata | Compile-time generated | META-INF/services entries created automatically by the AutoService annotation processor

Outputs

Output | Type | Description
Configured Twister2PipelineOptions | Interface instance | Fully configured options object accessible by the runner
Runner registration | Service metadata | Twister2Runner and Twister2TestRunner registered as discoverable pipeline runners
Options registration | Service metadata | Twister2PipelineOptions registered as a recognized options interface

Usage Examples

Command-Line Usage

# Run a Beam pipeline on a standalone Twister2 cluster
mvn exec:java -Dexec.mainClass=com.example.MyPipeline \
  -Dexec.args="--runner=Twister2Runner \
    --parallelism=8 \
    --clusterType=standalone \
    --twister2Home=/opt/twister2 \
    --workerCPUs=4 \
    --ramMegaBytes=4096"
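
Each flag in the command above matches a getter in the interface with the "get" prefix removed and the first letter lowercased (getWorkerCPUs corresponds to --workerCPUs). The sketch below illustrates that naming convention and the --name=value shape with plain JDK code. FlagNameSketch and its methods are hypothetical helpers, and the real parser handles more cases than this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FlagNameSketch {

  /** Derives a flag name from a getter name, e.g. "getWorkerCPUs" becomes "workerCPUs". */
  static String flagFor(String getterName) {
    if (!getterName.startsWith("get") || getterName.length() < 4) {
      throw new IllegalArgumentException("Not a getter: " + getterName);
    }
    String property = getterName.substring(3);
    return Character.toLowerCase(property.charAt(0)) + property.substring(1);
  }

  /** Splits "--name=value" arguments into an ordered name-to-value map. */
  static Map<String, String> parse(String... args) {
    Map<String, String> parsed = new LinkedHashMap<>();
    for (String arg : args) {
      int eq = arg.indexOf('=');
      // Require the "--" prefix and at least one character of name before '='.
      if (!arg.startsWith("--") || eq < 3) {
        throw new IllegalArgumentException("Expected --name=value, got: " + arg);
      }
      parsed.put(arg.substring(2, eq), arg.substring(eq + 1));
    }
    return parsed;
  }

  public static void main(String[] args) {
    System.out.println(flagFor("getRamMegaBytes"));
    System.out.println(parse("--parallelism=8", "--clusterType=standalone"));
  }
}
```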

Programmatic Usage

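import org.apache.beam.runners.twister2.Twister2PipelineOptions;
import org.apache.beam.runners.twister2.Twister2Runner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

// Create Twister2-specific options from the command-line arguments
Twister2PipelineOptions options = PipelineOptionsFactory
    .fromArgs(args)
    .withValidation()
    .as(Twister2PipelineOptions.class);

// Select the runner explicitly; otherwise it must arrive via --runner=Twister2Runner
options.setRunner(Twister2Runner.class);

// Configure Twister2-specific parameters
options.setParallelism(4);
options.setClusterType("kubernetes");
options.setTwister2Home("/opt/twister2");
options.setWorkerCPUs(2);
options.setRamMegaBytes(2048);

// Create and run pipeline
Pipeline pipeline = Pipeline.create(options);
// ... add transforms ...
pipeline.run().waitUntilFinish();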

Local Mode (No Twister2 Installation)

// When twister2Home is not set, the runner uses local mode
Twister2PipelineOptions options = PipelineOptionsFactory.as(Twister2PipelineOptions.class);
// Select the Twister2 runner; without this the default runner would be used
options.setRunner(Twister2Runner.class);
// twister2Home is null -> local mode, parallelism forced to 1
Pipeline pipeline = Pipeline.create(options);
pipeline.run();
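
The rule stated in the comments above (no twister2Home means local mode, and local mode forces parallelism to 1) can be written down as a small decision function. This is a sketch of the described behavior, not the runner's code. LocalModeSketch and both method names are hypothetical, and treating an empty string as "not set" is an assumption.

```java
final class LocalModeSketch {

  /** Local mode applies when no Twister2 home is configured (empty string is an assumption). */
  static boolean isLocalMode(String twister2Home) {
    return twister2Home == null || twister2Home.isEmpty();
  }

  /** In local mode the requested parallelism is ignored and 1 is used instead. */
  static int effectiveParallelism(String twister2Home, int requested) {
    return isLocalMode(twister2Home) ? 1 : requested;
  }

  public static void main(String[] args) {
    System.out.println(effectiveParallelism(null, 8));
    System.out.println(effectiveParallelism("/opt/twister2", 8));
  }
}
```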

Pattern Details

The Twister2PipelineOptions interface and Twister2RunnerRegistrar together implement the Service Provider Interface (SPI) + Configuration pattern:

  1. Interface declaration -- Twister2PipelineOptions extends PipelineOptions, StreamingOptions, and FileStagingOptions to add runner-specific configuration; @Default annotations supply fallback values (parallelism 1, cluster type standalone, job type java_zip, 2 CPUs and 2048 MB of RAM per worker)
  2. Runner registration -- Twister2RunnerRegistrar.Runner, annotated with @AutoService(PipelineRunnerRegistrar.class), registers Twister2Runner and Twister2TestRunner
  3. Options registration -- Twister2RunnerRegistrar.Options, annotated with @AutoService(PipelineOptionsRegistrar.class), registers the Twister2PipelineOptions interface
  4. Discovery -- At runtime, PipelineOptionsFactory uses Java's ServiceLoader to find every registered runner registrar and options registrar on the classpath, which is how --runner=Twister2Runner resolves by name
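
The registration and discovery steps above can be sketched with the JDK's ServiceLoader alone. The interface and provider below are hypothetical stand-ins for PipelineRunnerRegistrar and Twister2RunnerRegistrar.Runner. Because a single-file example has no META-INF/services entry, the provider is also passed in explicitly to show what discovery would yield.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.ServiceLoader;

final class RegistrarDiscoverySketch {

  /** Hypothetical stand-in for a runner registrar service interface. */
  interface RunnerRegistrar {
    List<String> runnerNames();
  }

  /** Hypothetical provider playing the role of Twister2RunnerRegistrar.Runner. */
  static final class DemoRegistrar implements RunnerRegistrar {
    @Override
    public List<String> runnerNames() {
      return Arrays.asList("Twister2Runner", "Twister2TestRunner");
    }
  }

  /** Flattens the runner names contributed by every registrar it is given. */
  static List<String> collect(Iterable<RunnerRegistrar> registrars) {
    List<String> names = new ArrayList<>();
    for (RunnerRegistrar registrar : registrars) {
      names.addAll(registrar.runnerNames());
    }
    return names;
  }

  public static void main(String[] args) {
    // ServiceLoader is itself an Iterable of providers listed under META-INF/services.
    // No such entry exists for this sketch, so nothing is discovered here.
    System.out.println(collect(ServiceLoader.load(RunnerRegistrar.class)));
    // Supplying the provider by hand shows the result a registered provider would produce.
    System.out.println(collect(Collections.singletonList(new DemoRegistrar())));
  }
}
```

In a real build, the @AutoService annotation processor writes the META-INF/services entry at compile time, so the provider class never has to be named by the code that consumes it.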

The @JsonIgnore annotation on getTSetEnvironment() is significant: the TSetEnvironment is a runtime-only object that cannot be serialized as part of the pipeline options JSON representation. It is set programmatically during pipeline execution setup.
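
To illustrate why a runtime-only getter is excluded, here is a minimal reflection-based sketch that turns getters into a name-to-value map and skips any getter carrying an ignore marker. It uses no JSON library: the Ignore annotation is a hypothetical stand-in for Jackson's @JsonIgnore, and DemoOptions is a hypothetical bean, not the real options interface.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

final class IgnoreSketch {

  /** Hypothetical stand-in for an ignore marker such as Jackson's @JsonIgnore. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface Ignore {}

  /** Hypothetical bean with one plain value and one runtime-only handle. */
  static final class DemoOptions {
    public int getParallelism() {
      return 4;
    }

    @Ignore
    public Object getEnvironment() {
      return new Object(); // stands in for a live, non-serializable handle
    }
  }

  /** Collects getter values into a sorted map, skipping getters marked with @Ignore. */
  static Map<String, Object> toMap(Object bean) {
    Map<String, Object> out = new TreeMap<>();
    for (Method m : bean.getClass().getDeclaredMethods()) {
      boolean getter = m.getName().startsWith("get")
          && m.getName().length() > 3
          && m.getParameterCount() == 0;
      if (!getter || m.isAnnotationPresent(Ignore.class)) {
        continue;
      }
      String property = m.getName().substring(3);
      String key = Character.toLowerCase(property.charAt(0)) + property.substring(1);
      try {
        out.put(key, m.invoke(bean));
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException("Cannot read " + key, e);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Only the parallelism entry appears; the environment handle is skipped.
    System.out.println(toMap(new DemoOptions()));
  }
}
```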
