Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Environment:ArroyoSystems Arroyo PostgreSQL Database

From Leeroopedia


Knowledge Sources
Domains Infrastructure, Database
Last Updated 2026-02-08 08:00 GMT

Overview

PostgreSQL 14+ database for Arroyo metadata storage in distributed deployment mode, with SQLite as the default for local/development mode.

Description

Arroyo uses a relational database for persisting pipeline definitions, connection profiles, connection tables, job state, UDF metadata, and checkpoint records. In distributed deployment mode, PostgreSQL is the recommended database backend, accessed via `deadpool-postgres` connection pooling and `cornucopia` compile-time checked SQL. For local development and single-node deployments, SQLite is the default backend (configured via `database.type = "sqlite"`). The database schema is managed through versioned migrations using the Refinery migration framework.

Usage

Use PostgreSQL when running Arroyo in distributed mode with multiple services (API, Controller, Compiler as separate processes). SQLite is the default for local mode (`arroyo run`) and development. The database stores all persistent state except checkpoint data (which goes to object storage).

System Requirements

Category Requirement Notes
Database PostgreSQL 14+ Required for distributed mode
Alternative SQLite 3.x Default for local/dev mode (via rusqlite 0.31)
Network TCP port 5432 (default) Configurable via `database.postgres.port`
Disk 1GB+ For metadata, migrations, and indexes

Dependencies

System Packages

  • `postgresql` (server, for production)
  • `postgresql-client` (for build-time code generation)
  • `libpq-dev` (PostgreSQL client library)

Rust Crate Dependencies

  • `deadpool-postgres` = 0.14 (connection pooling)
  • `cornucopia_async` = 0.6.0 (compile-time SQL, Arroyo fork)
  • `rusqlite` = 0.31 (SQLite alternative)
  • `refinery` = 0.8.14 (database migrations)

Credentials

The following configuration keys (settable via environment variables) control database access:

  • `ARROYO__DATABASE__TYPE`: Database backend selection (`postgres` or `sqlite`)
  • `ARROYO__DATABASE__POSTGRES__HOST`: PostgreSQL host (default: `localhost`)
  • `ARROYO__DATABASE__POSTGRES__PORT`: PostgreSQL port (default: `5432`)
  • `ARROYO__DATABASE__POSTGRES__USER`: PostgreSQL user (default: `arroyo`)
  • `ARROYO__DATABASE__POSTGRES__PASSWORD`: PostgreSQL password (default: `arroyo`)
  • `ARROYO__DATABASE__POSTGRES__DATABASE_NAME`: Database name (default: `arroyo`)
  • `DATABASE_URL`: Build-time PostgreSQL connection for cornucopia code generation

Legacy environment variables (deprecated, to be removed in 0.12):

  • `DATABASE_HOST`, `DATABASE_PORT`, `DATABASE_USER`, `DATABASE_PASSWORD`, `DATABASE_NAME`

Quick Install

# Install PostgreSQL (Debian/Ubuntu)
apt-get install -y postgresql postgresql-client

# Create database and user
sudo -u postgres psql -c "CREATE USER arroyo WITH PASSWORD 'arroyo';"
sudo -u postgres psql -c "CREATE DATABASE arroyo OWNER arroyo;"

# Or use SQLite (default, no setup needed)
# Arroyo will auto-create config.sqlite in the config directory

Code Evidence

Database configuration types from `config.rs:539-579`:

pub enum DatabaseType {
    Postgres,
    Sqlite,
}

pub struct DatabaseConfig {
    pub r#type: DatabaseType,
    pub postgres: PostgresConfig,
    pub sqlite: SqliteConfig,
}

pub struct PostgresConfig {
    pub database_name: String,
    pub host: String,
    pub port: u16,
    pub user: String,
    pub password: Sensitive<String>,
}

SQLite default path from `config.rs:571-579`:

impl Default for SqliteConfig {
    fn default() -> Self {
        Self {
            path: dirs::config_dir()
                .map(|p| p.join("arroyo/config.sqlite"))
                .unwrap_or_else(|| PathBuf::from_str("config.sqlite").unwrap()),
        }
    }
}

Default configuration from `default.toml:79-87`:

[database]
type = "sqlite"

[database.postgres]
database-name = "arroyo"
host = "localhost"
port = 5432
user = "arroyo"
password = "arroyo"

Common Errors

Error Message Cause Solution
`connection refused (os error 111)` PostgreSQL not running Start PostgreSQL: `systemctl start postgresql`
`password authentication failed` Wrong credentials Verify `ARROYO__DATABASE__POSTGRES__PASSWORD`
`database "arroyo" does not exist` Database not created Run `CREATE DATABASE arroyo;`
`FATAL: role "arroyo" does not exist` User not created Run `CREATE USER arroyo WITH PASSWORD 'arroyo';`

Compatibility Notes

  • SQLite vs PostgreSQL: SQLite is the default for local mode. Switch to PostgreSQL for distributed deployments with multiple controller/API replicas.
  • Build-time dependency: A running PostgreSQL is needed at build time for cornucopia SQL code generation (can use `DATABASE_URL` env var).
  • Migration compatibility: Separate migration directories exist for PostgreSQL (`migrations/`) and SQLite (`sqlite_migrations/`).
  • Legacy env vars: `DATABASE_HOST`, `DATABASE_PORT`, etc. are deprecated in favor of `ARROYO__DATABASE__POSTGRES__*` and will be removed in version 0.12.

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment