
Environment: Treeverse lakeFS Server Environment

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Data_Platform
Last Updated: 2026-02-08 10:00 GMT

Overview

The runtime environment for the lakeFS server. It requires a database backend (PostgreSQL, DynamoDB, or local Pebble), object storage credentials, and an authentication secret key.

Description

The lakeFS server runs as a single binary on Alpine Linux 3.21 (Docker) or any supported OS. It requires a metadata database (PostgreSQL 11+, DynamoDB, or embedded local Pebble), a blockstore backend (S3, GCS, Azure Blob, or local filesystem), and authentication configuration. The server listens on port 8000 by default and exposes both a REST API and an S3-compatible gateway.
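The three required pieces can be expressed in a minimal configuration-file sketch (all hostnames, credentials, and secrets below are placeholders; the key layout mirrors the `database.*`, `blockstore.*`, and `auth.*` settings referenced throughout this page):

```yaml
# Minimal lakeFS server config (sketch -- all values are placeholders)
listen_address: "0.0.0.0:8000"

auth:
  encrypt:
    secret_key: "replace-with-a-unique-secret"

database:
  type: postgres
  postgres:
    connection_string: "postgres://user:pass@host:5432/lakefs"

blockstore:
  type: s3
  s3:
    region: us-east-1
```

Pass the file to the server with `lakefs run -c <path>`, or set the equivalent `LAKEFS_*` environment variables shown under Credentials.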

Usage

Use this environment when deploying the lakeFS server in Docker, Kubernetes, or on bare metal. It is the mandatory runtime prerequisite for all API-based operations, including repository creation, branching, committing, merging, importing, and garbage collection.

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| OS | Alpine 3.21 (Docker) or Linux/macOS | Production deployments typically use Docker |
| Hardware | 2+ CPU cores, 2 GB+ RAM | More for production workloads |
| Disk | 1 GB+ SSD | For local cache (default 1 GB committed cache) |
| Network | Port 8000 (API), S3 gateway port | Configurable via `listen_address` |

Dependencies

Database Backend (one required)

  • PostgreSQL >= 11 (default: max 25 open connections, 5m connection lifetime)
  • DynamoDB (table name: `kvstore`, scan limit: 1024, max attempts: 10)
  • Local Pebble (embedded; path: `~/lakefs/metadata`, prefetch: 256)

Object Storage Backend (one required)

  • AWS S3 (region: us-east-1 default, max retries: 5, bucket region discovery enabled)
  • Google Cloud Storage (S3-compatible endpoint: `https://storage.googleapis.com`)
  • Azure Blob Storage (try timeout: 10 minutes)
  • Local filesystem (path: `~/lakefs/data/block`)

Credentials

The following environment variables or configuration keys must be set:

  • `LAKEFS_AUTH_ENCRYPT_SECRET_KEY`: CRITICAL encryption secret for auth tokens. Default is `THIS_MUST_BE_CHANGED_IN_PRODUCTION`.
  • `LAKEFS_BLOCKSTORE_SIGNING_SECRET_KEY`: Signing key for blockstore operations. Default is `OVERRIDE_THIS_SIGNING_SECRET_DEFAULT`.
  • `LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING`: PostgreSQL connection string (if using PostgreSQL).
  • `LAKEFS_BLOCKSTORE_S3_CREDENTIALS_ACCESS_KEY_ID`: AWS access key (if using S3).
  • `LAKEFS_BLOCKSTORE_S3_CREDENTIALS_SECRET_ACCESS_KEY`: AWS secret key (if using S3).
  • `LAKEFS_BLOCKSTORE_GS_CREDENTIALS_JSON`: GCS credentials JSON (if using GCS).
  • `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, `AZURE_TENANT_ID`: Azure credentials (if using Azure Blob).
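Both secret keys above ship with insecure defaults and must be overridden in production. One way to generate a strong value (a sketch using `openssl`, assumed to be installed):

```shell
# Generate a 32-byte (64 hex characters) random secret and export it for lakeFS
SECRET="$(openssl rand -hex 32)"
export LAKEFS_AUTH_ENCRYPT_SECRET_KEY="$SECRET"
echo "${#SECRET}"  # prints 64
```

Any sufficiently random, unique string works; the same approach applies to `LAKEFS_BLOCKSTORE_SIGNING_SECRET_KEY`.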

Quick Install

# Docker quickstart (local mode, no external database needed)
docker run --pull always -p 8000:8000 treeverse/lakefs run --quickstart

# Docker with PostgreSQL
docker run -p 8000:8000 \
  -e LAKEFS_DATABASE_TYPE=postgres \
  -e LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING="postgres://user:pass@host/db?sslmode=disable" \
  -e LAKEFS_AUTH_ENCRYPT_SECRET_KEY="your-secret-key-here" \
  -e LAKEFS_BLOCKSTORE_TYPE=s3 \
  treeverse/lakefs run

Code Evidence

Default configuration constants from `pkg/config/defaults.go:9-38`:

const (
    DefaultListenAddress               = "0.0.0.0:8000"
    DefaultAuthAPIHealthCheckTimeout   = 20 * time.Second
    DefaultAuthSecret                  = "THIS_MUST_BE_CHANGED_IN_PRODUCTION"
    DefaultSigningSecretKey            = "OVERRIDE_THIS_SIGNING_SECRET_DEFAULT"
    DefaultBlockstoreS3Region          = "us-east-1"
    DefaultBlockstoreS3MaxRetries      = 5
    DefaultBlockstoreS3PreSignedExpiry = 15 * time.Minute
    DefaultBlockstoreAzureTryTimeout   = 10 * time.Minute
)

Database defaults from `pkg/config/defaults.go:140-152`:

viper.SetDefault("database.local.path", "~/lakefs/metadata")
viper.SetDefault("database.local.prefetch_size", 256)
viper.SetDefault("database.local.sync_writes", true)

viper.SetDefault("database.dynamodb.table_name", "kvstore")
viper.SetDefault("database.dynamodb.scan_limit", 1024)
viper.SetDefault("database.dynamodb.max_attempts", 10)

viper.SetDefault("database.postgres.max_open_connections", 25)
viper.SetDefault("database.postgres.max_idle_connections", 25)
viper.SetDefault("database.postgres.connection_max_lifetime", "5m")

Cache configuration from `pkg/config/defaults.go:88-100`:

viper.SetDefault("committed.local_cache.size_bytes", 1*1024*1024*1024)  // 1 GB
viper.SetDefault("committed.local_cache.dir", "~/lakefs/data/cache")
viper.SetDefault("committed.local_cache.range_proportion", 0.9)
viper.SetDefault("committed.local_cache.metarange_proportion", 0.1)
viper.SetDefault("committed.sstable.memory.cache_size_bytes", 400_000_000)  // 400 MB
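The two proportions above partition the 1 GiB cache between range and metarange data. The effective byte budgets work out as follows (shell integer arithmetic, so the 0.9/0.1 split truncates):

```shell
# Split the default 1 GiB committed cache by the default proportions
TOTAL=$((1024 * 1024 * 1024))   # committed.local_cache.size_bytes
RANGE=$((TOTAL * 9 / 10))       # range_proportion = 0.9
METARANGE=$((TOTAL / 10))       # metarange_proportion = 0.1
echo "$TOTAL $RANGE $METARANGE"
# prints: 1073741824 966367641 107374182
```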

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| Auth health check timeout | Remote auth API unreachable | Check `auth.api.health_check_timeout` (default 20s) and network connectivity |
| `THIS_MUST_BE_CHANGED_IN_PRODUCTION` warning | Default auth secret used | Set `LAKEFS_AUTH_ENCRYPT_SECRET_KEY` to a unique secret |
| Connection refused on port 8000 | Server not started or port conflict | Verify `listen_address` configuration and port availability |
| DynamoDB throttling | Exceeded provisioned capacity | Increase DynamoDB `max_attempts` (default 10) or provision more capacity |

Compatibility Notes

  • Quickstart mode: Uses embedded local database and local blockstore. Not suitable for production.
  • PostgreSQL 11: Minimum version used in Docker Compose test environments. Production should use PostgreSQL 13+.
  • Azure operations: Have a 10-minute try timeout (much larger than S3/GCS defaults) due to slower Azure operations.
  • Presigned URLs: Disabled in UI by default across all providers (`disable_pre_signed_ui = true`).
