
Environment:ArroyoSystems Arroyo Kubernetes Deployment

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Kubernetes
Last Updated: 2026-02-08 08:00 GMT

Overview

A Kubernetes cluster environment with a Helm chart for deploying Arroyo in distributed mode with dynamically scheduled worker pods.

Description

This environment provides the Kubernetes infrastructure for running Arroyo in distributed mode. The Arroyo controller uses the Kubernetes API to dynamically schedule worker pods based on pipeline parallelism requirements. The Helm chart deploys the controller (which includes the API, controller, and compiler services) as a Deployment, with workers spawned as individual pods on demand. Resource allocation supports two modes: per-slot (resources scale with task count) and per-pod (fixed resources per pod). The chart includes RBAC roles for pod management and configurable service accounts.

Usage

Use this environment for production distributed deployments of Arroyo. The Kubernetes scheduler is activated by setting `controller.scheduler = "kubernetes"`. Workers are created and destroyed dynamically as pipelines start and stop. Each worker pod runs the Arroyo worker binary and connects back to the controller via gRPC.
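As a sketch, the scheduler setting described above can be expressed in the TOML configuration (the key paths below match the `default.toml` evidence later on this page; the equivalent environment variable is `ARROYO__CONTROLLER__SCHEDULER=kubernetes`):

```toml
# Illustrative sketch: enable the Kubernetes scheduler on the controller.
[controller]
scheduler = "kubernetes"

[kubernetes-scheduler]
namespace = "default"        # namespace where worker pods are created
resource-mode = "per-slot"   # or "per-pod"
```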

System Requirements

Category               | Requirement        | Notes
Kubernetes             | 1.24+              | k8s-openapi 0.24.0 compatibility
Helm                   | 3.x                | For chart installation
Container Runtime      | Docker/containerd  | Standard K8s runtime
CPU per worker slot    | 900m (default)     | Configurable via Helm values
Memory per worker slot | 500Mi (default)    | Configurable via Helm values

Dependencies

Kubernetes Resources

  • ServiceAccount (for pod management)
  • Role/RoleBinding (pods create/delete/get/list/watch)
  • Deployment (controller)
  • ConfigMap (configuration)
  • Service (API, gRPC endpoints)
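A minimal sketch of the Role and RoleBinding listed above, with the pod verbs named in this list (resource names and namespace are illustrative; the Helm chart generates its own):

```yaml
# Illustrative RBAC sketch: allows the controller to manage worker pods.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: arroyo-pod-manager     # hypothetical name
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "delete", "get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: arroyo-pod-manager     # hypothetical name
  namespace: default
subjects:
  - kind: ServiceAccount
    name: default              # matches service-account-name in the worker config
    namespace: default
roleRef:
  kind: Role
  name: arroyo-pod-manager
  apiGroup: rbac.authorization.k8s.io
```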

Container Images

  • `ghcr.io/arroyosystems/arroyo:latest` (default for both controller and workers)

Credentials

  • `ARROYO__CONTROLLER__SCHEDULER`: Set to `kubernetes` to enable K8s scheduler
  • Kubernetes ServiceAccount with pod management RBAC permissions
  • Cloud storage credentials (see Object_Storage environment) for checkpoint access from worker pods

Quick Install

# Install via Helm
helm repo add arroyo https://arroyosystems.github.io/helm-charts
helm install arroyo arroyo/arroyo

# Or from source
helm install arroyo ./k8s/arroyo \
  --set config.scheduler=kubernetes \
  --set config.checkpointUrl=s3://my-bucket/checkpoints

Code Evidence

Default worker configuration from `default.toml:62-76`:

[kubernetes-scheduler]
namespace = "default"
resource-mode = "per-slot"

[kubernetes-scheduler.worker]
name-prefix = "arroyo"
image = "ghcr.io/arroyosystems/arroyo:latest"
image-pull-policy = "IfNotPresent"
service-account-name = "default"
resources = { requests = { cpu = "900m",  memory = "500Mi" } }
task-slots = 16
command = "/app/arroyo worker"

Resource mode options from `config.rs:596-605`:

pub enum ResourceMode {
    /// In per-slot mode, tasks are packed onto workers up to the
    /// `task-slots` config, and for each slot the amount of resources
    /// specified in `resources` is provided
    PerSlot,
    /// In per-pod mode, every pod has exactly `task-slots` slots,
    /// and exactly the resources in `resources`, even if it is
    /// scheduled for fewer slots.
    PerPod,
}
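The difference between the two modes can be illustrated with the default requests from the configuration above (900m CPU per slot). This is a standalone sketch, not Arroyo code:

```rust
// Sketch (not Arroyo code): how worker pod CPU requests differ between
// the two resource modes for a pod scheduled with `slots` tasks.

/// Per-slot mode: each scheduled slot adds the configured request.
fn per_slot_request_millicpu(slots: u32, per_slot_millicpu: u32) -> u32 {
    slots * per_slot_millicpu
}

/// Per-pod mode: the pod always requests the fixed amount,
/// even if it is scheduled for fewer slots.
fn per_pod_request_millicpu(_slots: u32, pod_millicpu: u32) -> u32 {
    pod_millicpu
}

fn main() {
    // A worker pod running 4 tasks with the 900m default:
    println!("per-slot: {}m", per_slot_request_millicpu(4, 900)); // 3600m
    println!("per-pod:  {}m", per_pod_request_millicpu(4, 900)); // 900m
}
```

In per-slot mode a lightly loaded pod requests proportionally less from the cluster, while per-pod mode gives predictable, fixed-size pods.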

Scheduler types from `config.rs:581-588`:

pub enum Scheduler {
    Embedded,   // In-process (for local mode)
    Process,    // Separate OS processes
    Node,       // Arroyo node service
    Kubernetes, // K8s pod scheduler
}

Common Errors

Error Message | Cause | Solution
`pods is forbidden: User cannot create resource "pods"` | Missing RBAC permissions | Verify the ServiceAccount and Role/RoleBinding
`ImagePullBackOff` | Cannot pull the worker image | Check image registry access and the image name
Worker pods stuck in `Pending` | Insufficient cluster resources | Scale the cluster or reduce `resources.requests`
`connection refused` from worker to controller | Network policy blocking gRPC | Ensure port 5116 is accessible between pods

Compatibility Notes

  • Resource modes: `per-slot` (default) scales resources linearly with task count. `per-pod` gives fixed resources regardless of task count (legacy behavior from before 0.11).
  • Worker ports: Workers use fixed ports in K8s mode: RPC=6900, Admin=6901 (vs random ports in process mode).
  • Image consistency: Worker pods must use the same image version as the controller to avoid protocol mismatches.
  • Node selectors and tolerations: Configurable via `kubernetes-scheduler.worker.node-selector` and `tolerations` for scheduling constraints.
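For instance, the scheduling constraints from the last bullet might be configured as follows. The key names come from the bullet above, but the exact TOML shape for tolerations is an assumption modeled on Kubernetes' toleration fields, and all label and taint values are hypothetical:

```toml
# Illustrative sketch: pin worker pods to labeled nodes and tolerate a taint.
[kubernetes-scheduler.worker]
node-selector = { "example.com/pool" = "streaming" }  # hypothetical label

[[kubernetes-scheduler.worker.tolerations]]
key = "example.com/dedicated"    # hypothetical taint key
operator = "Equal"
value = "arroyo"
effect = "NoSchedule"
```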
