Environment:ArroyoSystems Arroyo Rust Runtime
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Stream_Processing |
| Last Updated | 2026-02-08 08:00 GMT |
Overview
Rust 2024 edition runtime with Arrow 55.2.0, DataFusion 48.0.1, and Tonic 0.13 gRPC framework for the Arroyo distributed stream processing engine.
Description
This environment provides the complete Rust toolchain and dependency set required to build and run Arroyo. The system is built on top of the Rust 2024 edition and uses a workspace-based Cargo project. Key frameworks include Apache Arrow for columnar data processing, DataFusion for SQL query execution, Tonic for gRPC inter-service communication, Tokio for the async runtime, and Axum for the REST API layer. The build uses custom forks of several upstream crates (arrow-rs, arrow-datafusion, sqlparser-rs, cornucopia) maintained by Arroyo Systems.
Usage
Use this environment for building from source, developing, or running the Arroyo stream processing engine. This is the mandatory prerequisite for all Arroyo server components (API, Controller, Compiler, Worker, Node).
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux (Debian/Ubuntu preferred) | Docker builds use `rust:1-bookworm` / `debian:bookworm-slim` |
| Rust | Edition 2024 (Rust 1.85+) | Workspace resolver v2 |
| CPU | x86_64 or aarch64 | jemalloc allocator used on these architectures (non-MSVC) |
| Disk | 10GB+ SSD | For build artifacts, dependencies, and checkpoint storage |
Dependencies
System Packages
- `protoc` (Protocol Buffers compiler)
- `cmake`
- `clang` (auto-installable via config `compiler.install-clang = true`)
- `libssl-dev` / `openssl`
- `libsasl2-dev` (for Kafka SASL authentication)
- `pkg-config`
- `build-essential`
- `curl`
- `wget`
- `unzip` (to extract the protoc release archive)
Rust Crate Dependencies (Key)
- `arrow` = 55.2.0 (Apache Arrow columnar format)
- `datafusion` = 48.0.1 (SQL query engine, custom Arroyo fork)
- `parquet` = 55.2.0 (Parquet file format, custom Arroyo fork)
- `tonic` = 0.13 (gRPC framework with zstd, TLS)
- `tokio` = 1.x (async runtime)
- `axum` = 0.7 (HTTP framework)
- `rustls` = 0.23.27 (TLS implementation)
- `prost` = 0.13 (Protocol Buffers)
- `object_store` = 0.12.3 (cloud storage abstraction)
- `sqlparser` = 0.55.0 (SQL parser, custom Arroyo fork)
- `prometheus` = 0.14.0 (metrics)
- `k8s-openapi` = 0.24.0 (Kubernetes API)
- `deltalake` = 0.27.0 (Delta Lake integration)
- `rusqlite` = 0.31 (SQLite driver)
- `deadpool-postgres` = 0.14 (PostgreSQL connection pool)
Credentials
The following environment variables can be set for configuration:
- `ARROYO__*`: Any configuration key can be set through an environment variable with the `ARROYO__` prefix. In the rest of the name, a double underscore maps to `.` and a single underscore maps to `-`.
- `DATABASE_URL`: PostgreSQL connection string for build-time code generation (optional, defaults to localhost)
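The `ARROYO__*` naming rule can be illustrated with a small standalone sketch. It applies the two replacements described above to a variable name. The function name `env_to_config_key` and the example variable are hypothetical, and lowercasing the key is an assumption about how names are normalized, not something this page confirms.

```rust
/// Hypothetical helper: converts an `ARROYO__*` environment variable name
/// into the dotted config key it would address.
/// Returns `None` when the name lacks the `ARROYO__` prefix.
fn env_to_config_key(var: &str) -> Option<String> {
    let rest = var.strip_prefix("ARROYO__")?;
    // Assumption: keys are compared in lowercase.
    let key = rest
        .to_lowercase()
        .replace("__", ".") // double underscore separates nesting levels
        .replace('_', "-"); // remaining single underscores become hyphens
    Some(key)
}
```

Under these assumptions, `env_to_config_key("ARROYO__API__HTTP_PORT")` yields `Some("api.http-port")`, while a name without the prefix, such as `DATABASE_URL`, yields `None`.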
Quick Install
```shell
# Install system dependencies (Debian/Ubuntu)
apt-get install -y curl pkg-config unzip build-essential libssl-dev openssl cmake clang wget libsasl2-dev

# Install Rust (if not present)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y

# Install protoc (this archive is for x86_64; on aarch64, use the linux-aarch_64 archive)
PROTOC_VERSION=21.8
curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOC_VERSION}/protoc-${PROTOC_VERSION}-linux-x86_64.zip
unzip protoc-${PROTOC_VERSION}-linux-x86_64.zip -d /usr/local

# Build Arroyo
cargo build --release
```
Code Evidence
Workspace configuration from `Cargo.toml:30-50`:
```toml
resolver = "2"

[workspace.dependencies]
tonic = { version = "0.13", features = ["zstd", "transport", "tls-ring", "tls-native-roots"] }
arrow = { version = "55.2.0" }
datafusion = { version = "48.0.1" }
parquet = { version = "55.2.0" }
object_store = { version = "0.12.3" }
rustls = "0.23.27"
```
Configuration loading priority from `config.rs:141-189`:
```rust
fn load_config(paths: &[PathBuf]) -> Figment {
    // Priority (from highest--overriding--to lowest--overridden) is:
    //   1. ARROYO__* environment variables
    //   2. The config file specified in <path>
    //   3. Any *.toml or *.yaml files specified in <dir>
    //   4. arroyo.toml in the current directory
    //   5. $(user conf dir)/arroyo/config.{toml,yaml}
    //   6. ../default.toml
    let mut figment = Figment::from(Toml::string(DEFAULT_CONFIG));
    // ...
    figment.admerge(
        Env::prefixed("ARROYO__")
            .map(|p| p.as_str().replace("__", ".").replace("_", "-").into()),
    )
}
```
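The priority order in the comment above amounts to merging layers from lowest to highest priority, with each later layer overwriting keys set by earlier ones. A minimal sketch of that idea using only the standard library follows. It does not use Figment, and `merge_layers`, `layer`, and the sample keys and values are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical helper: merges config layers given in order from lowest
/// to highest priority. A key set in a later layer replaces the value
/// from any earlier layer.
fn merge_layers(layers: &[HashMap<String, String>]) -> HashMap<String, String> {
    let mut merged = HashMap::new();
    for layer in layers {
        for (key, value) in layer {
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}

/// Builds one layer from string pairs (illustration only).
fn layer(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

Passing a defaults layer first and an environment layer last means a key present in both resolves to the environment value, while keys set only in the defaults pass through unchanged.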
Custom forks from `Cargo.toml:87-101`:
```toml
[patch.crates-io]
parquet = {git = 'https://github.com/ArroyoSystems/arrow-rs', branch = '55.2.0/parquet'}
datafusion = {git = 'https://github.com/ArroyoSystems/arrow-datafusion', branch = '48.0.1/arroyo'}
sqlparser = { git = "https://github.com/ArroyoSystems/sqlparser-rs", branch = "0.55.0/arroyo" }
cornucopia_async = { git = "https://github.com/ArroyoSystems/cornucopia", branch = "sqlite" }
```
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `error: failed to run custom build command for arroyo-api` | Missing PostgreSQL for cornucopia code generation | Install PostgreSQL or set `DATABASE_URL` environment variable |
| `error: could not compile ... edition2024` | Rust version too old for the 2024 edition | Update to Rust 1.85+ with `rustup update` |
| `protoc not found` | Missing Protocol Buffers compiler | Install protoc v21.8+ |
| `error: linker cc not found` | Missing C compiler | Install `build-essential` and `clang` |
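
To check the toolchain before building, the output of `rustc --version` can be compared against the first release that stabilized the 2024 edition, Rust 1.85.0. The sketch below assumes the usual `rustc X.Y.Z ...` output shape. `supports_edition_2024` is a hypothetical helper and not part of Arroyo.

```rust
/// Hypothetical helper: returns true when a `rustc --version` line reports
/// a compiler at or above 1.85, the first release with the 2024 edition.
/// Returns false when the line cannot be parsed.
fn supports_edition_2024(version_line: &str) -> bool {
    // Expected shape: "rustc 1.85.0 (<hash> <date>)"
    let Some(version) = version_line.split_whitespace().nth(1) else {
        return false;
    };
    let mut parts = version.split('.');
    let major = parts.next().and_then(|p| p.parse::<u32>().ok());
    let minor = parts.next().and_then(|p| p.parse::<u32>().ok());
    match (major, minor) {
        // Tuples compare lexicographically: major first, then minor.
        (Some(major), Some(minor)) => (major, minor) >= (1, 85),
        _ => false,
    }
}
```

A caller could obtain the version line with `std::process::Command::new("rustc").arg("--version")` and pass the captured standard output to this function.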
Compatibility Notes
- Custom Forks: Arroyo uses patched versions of parquet, datafusion, sqlparser, and cornucopia. These patches are maintained on separate branches in the ArroyoSystems GitHub org.
- jemalloc: Used as the global allocator on `x86_64` and `aarch64` Linux (non-MSVC) for better memory performance. Not used on other platforms.
- Windows: Not officially supported. Development should use Linux or macOS.
- macOS: Supported for development; production deployments target Linux.