
Environment: Apache Beam Portable Runner Environment

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Portability
Last Updated: 2026-02-09 04:00 GMT

Overview

A portability-framework runtime requiring gRPC, a Job Service endpoint (local or remote), and either a Docker or loopback environment for SDK harness execution.

Description

This environment supports the Beam Portability Framework, which enables language-independent pipeline execution. The `PortableRunner` translates pipelines to protobuf format and submits them to a Job Service via gRPC. The Job Service manages job lifecycle (prepare → run → monitor) and coordinates artifact staging. SDK harnesses can run in Docker containers or via a loopback connection for local development. The `InMemoryJobService` provides a lightweight in-process job service, while external job services (Flink, Spark) run as separate processes.
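The prepare → run → monitor lifecycle can be pictured as a small state machine. The sketch below is a toy model using only the standard library; the state names loosely follow the `JobState` enum in Beam's Job API, but the transition rules here are illustrative assumptions, not Beam's actual implementation.

```java
// Toy model of the Job Service lifecycle. State names loosely follow
// Beam's JobState enum; the transition rules are illustrative only.
enum JobState {
  STOPPED, STARTING, RUNNING, CANCELLING,
  DONE, FAILED, CANCELLED;

  /** Terminal states: the job can no longer change state. */
  boolean isTerminal() {
    return this == DONE || this == FAILED || this == CANCELLED;
  }

  /** Illustrative transition check (an assumption, not Beam's logic). */
  boolean canTransitionTo(JobState next) {
    if (isTerminal()) {
      return false; // terminal states are final
    }
    switch (this) {
      case STOPPED:    return next == STARTING;
      case STARTING:   return next == RUNNING || next == FAILED;
      case RUNNING:    return next.isTerminal() || next == CANCELLING;
      case CANCELLING: return next == CANCELLED || next == FAILED;
      default:         return false;
    }
  }
}
```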

Usage

Use this environment when running cross-language pipelines, using the Flink or Spark portable runners, or developing custom portable runners. It is required for `PortableRunner`, `InMemoryJobService`, `PortablePipelineJarCreator`, and `JobServicePipelineResult`.
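As a concrete starting point, here is a minimal submission sketch, assuming the Beam Java SDK and `beam-runners-portability-java` are on the classpath. The endpoint is a placeholder (`8099` is the conventional default port for the Flink and Spark job servers; adjust for your setup):

```java
import org.apache.beam.runners.portability.PortableRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.PortablePipelineOptions;

public class SubmitPortablePipeline {
  public static void main(String[] args) {
    PortablePipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).as(PortablePipelineOptions.class);
    options.setRunner(PortableRunner.class);
    // Placeholder endpoint: point this at your Job Service.
    options.setJobEndpoint("localhost:8099");
    // LOOPBACK avoids Docker; use "DOCKER" for containerized SDK harnesses.
    options.setDefaultEnvironmentType("LOOPBACK");

    Pipeline pipeline = Pipeline.create(options);
    // ... apply transforms here ...
    pipeline.run().waitUntilFinish();
  }
}
```

The pipeline is translated to its protobuf form and submitted over gRPC to the Job Service at the configured endpoint, as described above.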

System Requirements

| Category | Requirement | Notes |
| --- | --- | --- |
| Runtime | JVM 11+ | Job server Docker images use `eclipse-temurin:11` |
| Network | gRPC connectivity to the Job Service | Default uses localhost; remote requires network access |
| Container Runtime | Docker (for the DOCKER environment type) | Not required for LOOPBACK mode |
| OS | Linux or macOS | Docker mode requires Linux containers |

Dependencies

System Packages

  • `eclipse-temurin` JDK 11 (for job server containers)
  • `libltdl7` (required by job server Docker images)
  • `docker` (for DOCKER environment type)
  • `python3` + virtualenv (for Python SDK harness in portability tests)

Java Packages

  • `beam-runners-portability-java` (portable runner)
  • `beam-runners-java-job-service` (in-memory job service)
  • gRPC libraries (for Job Service API and Artifact Staging)
  • Protobuf libraries (for pipeline serialization)
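If you consume these packages from Maven Central rather than building from source, the coordinates below are the published artifact names; the version is a placeholder to replace with a released Beam version (a Gradle Kotlin DSL fragment, shown for illustration):

```kotlin
// build.gradle.kts -- the version is a placeholder; use a released Beam version.
dependencies {
    implementation("org.apache.beam:beam-runners-portability-java:2.XX.0")
    implementation("org.apache.beam:beam-runners-java-job-service:2.XX.0")
}
```

The gRPC and protobuf libraries are pulled in transitively by these artifacts.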

Credentials

  • No credentials required for local/loopback mode.
  • For remote Job Service connections, network-level access is required.
  • For Docker SDK harness mode, Docker daemon access is required.

Quick Install

# Build the portable runner and job service:
./gradlew :runners:portability:java:build
./gradlew :runners:java-job-service:build

# Run with loopback (no Docker required) by passing the pipeline option:
--environment_type=LOOPBACK

# Run with a Docker SDK harness:
--environment_type=DOCKER

Code Evidence

gRPC Job Service connection from `PortableRunner.java:174-177`:

ManagedChannel jobServiceChannel =
    ManagedChannelBuilder.forTarget(options.getJobEndpoint())
        .usePlaintext()
        .build();

Job preparation via gRPC from `InMemoryJobService.java:168-216`:

public void prepare(
    JobApi.PrepareJobRequest request,
    StreamObserver<JobApi.PrepareJobResponse> responseObserver) {
  // Validates pipeline, creates staging token, returns preparation ID
}

Artifact staging via JAR packaging from `PortablePipelineJarCreator.java:89-113`:

public PortablePipelineResult run(Pipeline pipeline) {
  // Packages pipeline proto + artifacts into self-executing JAR
}
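The jar-creation step above can be illustrated with the standard library alone. The sketch below is not Beam's implementation: the entry name `pipeline.json` and the manifest contents are assumptions chosen for illustration, not the layout `PortablePipelineJarCreator` actually produces.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

/** Sketch: bundle a serialized pipeline into a jar (entry name is illustrative). */
class PipelineJarSketch {
  static Path createPipelineJar(Path dir, byte[] serializedPipeline) throws IOException {
    Path jarPath = dir.resolve("pipeline.jar");
    Manifest manifest = new Manifest();
    manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
    // A real self-executing jar would also set Main-Class to an entry point.
    try (OutputStream os = Files.newOutputStream(jarPath);
         JarOutputStream jar = new JarOutputStream(os, manifest)) {
      jar.putNextEntry(new JarEntry("pipeline.json")); // illustrative entry name
      jar.write(serializedPipeline);
      jar.closeEntry();
    }
    return jarPath;
  }
}
```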

Common Errors

| Error Message | Cause | Solution |
| --- | --- | --- |
| gRPC connection refused to Job Service | Job Service not running or wrong endpoint | Start the Job Service or verify the `--jobEndpoint` configuration |
| Docker SDK harness failed to start | Docker not installed or daemon not running | Install Docker and ensure the daemon is running, or use `LOOPBACK` mode |
| Artifact staging timeout | Large dependencies or slow network | Increase the staging timeout; consider `PortablePipelineJarCreator` for offline packaging |
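The first error above ("connection refused") can be ruled out before submission with a plain TCP probe, no gRPC required. This is a minimal sketch using only the standard library; the `host:port` format matches what `--jobEndpoint` expects, and a successful probe only shows a listener is present, not that a Job Service is behind it.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Quick TCP reachability probe for a Job Service endpoint like "localhost:8099". */
class JobEndpointProbe {
  static boolean isReachable(String endpoint, int timeoutMillis) {
    int colon = endpoint.lastIndexOf(':');
    if (colon < 0) {
      throw new IllegalArgumentException("Expected host:port, got " + endpoint);
    }
    String host = endpoint.substring(0, colon);
    int port = Integer.parseInt(endpoint.substring(colon + 1));
    try (Socket socket = new Socket()) {
      // Only verifies the TCP listener; a working gRPC handshake still
      // depends on the Job Service actually running behind this port.
      socket.connect(new InetSocketAddress(host, port), timeoutMillis);
      return true;
    } catch (IOException e) {
      return false;
    }
  }
}
```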

Compatibility Notes

  • LOOPBACK mode: SDK harness runs in the same process; useful for development and testing. Not suitable for production cross-language pipelines.
  • DOCKER mode: Requires Docker daemon; SDK harness runs in separate containers. Required for production cross-language support.
  • Job Server Containers: Both Flink and Spark job servers use `eclipse-temurin:11` with `libltdl7`.
  • Python SDK: Portability tests require Python SDK installation in a virtualenv.
