Environment:ArroyoSystems Arroyo PostgreSQL Database
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Database |
| Last Updated | 2026-02-08 08:00 GMT |
Overview
PostgreSQL 14+ database for Arroyo metadata storage in distributed deployment mode, with SQLite as the default for local/development mode.
Description
Arroyo uses a relational database for persisting pipeline definitions, connection profiles, connection tables, job state, UDF metadata, and checkpoint records. In distributed deployment mode, PostgreSQL is the recommended database backend, accessed via `deadpool-postgres` connection pooling and `cornucopia` compile-time checked SQL. For local development and single-node deployments, SQLite is the default backend (configured via `database.type = "sqlite"`). The database schema is managed through versioned migrations using the Refinery migration framework.
Usage
Use PostgreSQL when running Arroyo in distributed mode with multiple services (API, Controller, Compiler as separate processes). SQLite is the default for local mode (`arroyo run`) and development. The database stores all persistent state except checkpoint data (which goes to object storage).
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| Database | PostgreSQL 14+ | Required for distributed mode |
| Alternative | SQLite 3.x | Default for local/dev mode (via rusqlite 0.31) |
| Network | TCP port 5432 (default) | Configurable via `database.postgres.port` |
| Disk | 1GB+ | For metadata, migrations, and indexes |
Dependencies
System Packages
- `postgresql` (server, for production)
- `postgresql-client` (for build-time code generation)
- `libpq-dev` (PostgreSQL client library)
Rust Crate Dependencies
- `deadpool-postgres` = 0.14 (connection pooling)
- `cornucopia_async` = 0.6.0 (compile-time SQL, Arroyo fork)
- `rusqlite` = 0.31 (SQLite alternative)
- `refinery` = 0.8.14 (database migrations)
Credentials
The following configuration keys (settable via environment variables) control database access:
- `ARROYO__DATABASE__TYPE`: Database backend selection (`postgres` or `sqlite`)
- `ARROYO__DATABASE__POSTGRES__HOST`: PostgreSQL host (default: `localhost`)
- `ARROYO__DATABASE__POSTGRES__PORT`: PostgreSQL port (default: `5432`)
- `ARROYO__DATABASE__POSTGRES__USER`: PostgreSQL user (default: `arroyo`)
- `ARROYO__DATABASE__POSTGRES__PASSWORD`: PostgreSQL password (default: `arroyo`)
- `ARROYO__DATABASE__POSTGRES__DATABASE_NAME`: Database name (default: `arroyo`)
- `DATABASE_URL`: Build-time PostgreSQL connection for cornucopia code generation
Legacy environment variables (deprecated, to be removed in 0.12):
- `DATABASE_HOST`, `DATABASE_PORT`, `DATABASE_USER`, `DATABASE_PASSWORD`, `DATABASE_NAME`
Quick Install
# Install PostgreSQL (Debian/Ubuntu)
apt-get install -y postgresql postgresql-client
# Create database and user
sudo -u postgres psql -c "CREATE USER arroyo WITH PASSWORD 'arroyo';"
sudo -u postgres psql -c "CREATE DATABASE arroyo OWNER arroyo;"
# Or use SQLite (default, no setup needed)
# Arroyo will auto-create config.sqlite in the config directory
Code Evidence
Database configuration types from `config.rs:539-579`:
pub enum DatabaseType {
Postgres,
Sqlite,
}
pub struct DatabaseConfig {
pub r#type: DatabaseType,
pub postgres: PostgresConfig,
pub sqlite: SqliteConfig,
}
pub struct PostgresConfig {
pub database_name: String,
pub host: String,
pub port: u16,
pub user: String,
pub password: Sensitive<String>,
}
SQLite default path from `config.rs:571-579`:
impl Default for SqliteConfig {
fn default() -> Self {
Self {
path: dirs::config_dir()
.map(|p| p.join("arroyo/config.sqlite"))
.unwrap_or_else(|| PathBuf::from_str("config.sqlite").unwrap()),
}
}
}
Default configuration from `default.toml:79-87`:
[database]
type = "sqlite"
[database.postgres]
database-name = "arroyo"
host = "localhost"
port = 5432
user = "arroyo"
password = "arroyo"
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `connection refused (os error 111)` | PostgreSQL not running | Start PostgreSQL: `systemctl start postgresql` |
| `password authentication failed` | Wrong credentials | Verify `ARROYO__DATABASE__POSTGRES__PASSWORD` |
| `database "arroyo" does not exist` | Database not created | Run `CREATE DATABASE arroyo;` |
| `FATAL: role "arroyo" does not exist` | User not created | Run `CREATE USER arroyo WITH PASSWORD 'arroyo';` |
Compatibility Notes
- SQLite vs PostgreSQL: SQLite is the default for local mode. Switch to PostgreSQL for distributed deployments with multiple controller/API replicas.
- Build-time dependency: A running PostgreSQL is needed at build time for cornucopia SQL code generation (can use `DATABASE_URL` env var).
- Migration compatibility: Separate migration directories exist for PostgreSQL (`migrations/`) and SQLite (`sqlite_migrations/`).
- Legacy env vars: `DATABASE_HOST`, `DATABASE_PORT`, etc. are deprecated in favor of `ARROYO__DATABASE__POSTGRES__*` and will be removed in version 0.12.