Environment:Langfuse Langfuse ClickHouse Analytics
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Analytics_Database |
| Last Updated | 2026-02-14 06:00 GMT |
Overview
ClickHouse analytics database (version 24.3 to 25.8) for high-volume trace, observation, score, and event storage with configurable connection pooling, async inserts, and cluster support.
Description
ClickHouse serves as the analytics database for Langfuse, storing all high-volume tracing data including traces, observations (spans/generations), scores, and events. It is accessed via the @clickhouse/client npm package (v1.13.0) with a singleton connection pattern. The system supports read-only replicas, cluster mode, and configurable deletion strategies (alter_update, lightweight_update, lightweight_update_force). Migrations use shell scripts in packages/shared/clickhouse/scripts/.
Usage
Use this environment for all Langfuse deployments. ClickHouse is mandatory for storing and querying trace data, observation metrics, scores, and dashboard analytics. It works alongside PostgreSQL, which handles relational and configuration data.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| Database | ClickHouse 24.3+ | Development uses 25.8; production tested with 24.3+ |
| Disk | 50GB+ SSD | High IOPS required; scales with trace volume |
| RAM | 4GB+ | ClickHouse is memory-intensive for GROUP BY operations |
| Network | TCP ports 8123 (HTTP), 9000 (native) | HTTP used by Node.js client |
Dependencies
System Packages
clickhouse-server >= 24.3 (via Docker: clickhouse/clickhouse-server)
Node.js Packages
@clickhouse/client = 1.13.0
Credentials
The following environment variables must be set:
- CLICKHOUSE_URL: (Required) ClickHouse HTTP endpoint (e.g., http://localhost:8123)
- CLICKHOUSE_USER: (Required) ClickHouse auth user
- CLICKHOUSE_PASSWORD: (Required) ClickHouse auth password
- CLICKHOUSE_DB: Database name (default: default)
- CLICKHOUSE_CLUSTER_NAME: Cluster name (default: default)
- CLICKHOUSE_CLUSTER_ENABLED: Enable cluster mode (default: true)
- CLICKHOUSE_READ_ONLY_URL: Read-only replica URL for legacy tables (optional)
- CLICKHOUSE_EVENTS_READ_ONLY_URL: Read-only replica URL for events table (optional)
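As an illustration of how these variables and their defaults fit together, the following is a minimal sketch, not the Langfuse implementation. The names `readConnectionConfig`, `required`, and `ClickhouseConnectionConfig` are hypothetical, and the rule that only the literal string "false" disables cluster mode is an assumption.

```typescript
// Hypothetical helper for illustration; not taken from the Langfuse codebase.
type Env = Record<string, string | undefined>;

interface ClickhouseConnectionConfig {
  url: string;
  username: string;
  password: string;
  database: string;
  clusterName: string;
  clusterEnabled: boolean;
  readOnlyUrl: string | undefined;
  eventsReadOnlyUrl: string | undefined;
}

// Return the value of a required variable or fail with a descriptive error.
function required(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

// Apply the documented defaults to the optional variables.
function readConnectionConfig(env: Env): ClickhouseConnectionConfig {
  return {
    url: required(env, "CLICKHOUSE_URL"),
    username: required(env, "CLICKHOUSE_USER"),
    password: required(env, "CLICKHOUSE_PASSWORD"),
    database: env["CLICKHOUSE_DB"] ?? "default",
    clusterName: env["CLICKHOUSE_CLUSTER_NAME"] ?? "default",
    // Assumption: only the literal string "false" turns cluster mode off.
    clusterEnabled: (env["CLICKHOUSE_CLUSTER_ENABLED"] ?? "true") !== "false",
    readOnlyUrl: env["CLICKHOUSE_READ_ONLY_URL"],
    eventsReadOnlyUrl: env["CLICKHOUSE_EVENTS_READ_ONLY_URL"],
  };
}
```

In a Node.js process the function would typically be called with `process.env`; taking the environment as a parameter keeps the sketch easy to test.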
Performance Tuning
- CLICKHOUSE_KEEP_ALIVE_IDLE_SOCKET_TTL: Idle socket timeout in ms (default: 9000)
- CLICKHOUSE_MAX_OPEN_CONNECTIONS: Connection pool size (default: 25)
- CLICKHOUSE_MAX_BYTES_BEFORE_EXTERNAL_GROUP_BY: Memory threshold for GROUP BY (default: 32,000,000,000 bytes, about 32 GB)
- CLICKHOUSE_ASYNC_INSERT_MAX_DATA_SIZE: Maximum data size for async inserts (optional)
- CLICKHOUSE_ASYNC_INSERT_BUSY_TIMEOUT_MS: Busy timeout for async inserts in ms (optional)
- CLICKHOUSE_LIGHTWEIGHT_DELETE_MODE: Deletion strategy (default: alter_update)
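The tuning variables can be validated in the same spirit. The sketch below is an assumption-laden illustration rather than the project's actual parsing code: `readTuningConfig`, `optionalNumber`, `parseDeleteMode`, and `TuningConfig` are hypothetical names. It applies the defaults listed above and rejects unknown deletion modes.

```typescript
// Hypothetical helpers for illustration; not taken from the Langfuse codebase.
type Env = Record<string, string | undefined>;

const DELETE_MODES = ["alter_update", "lightweight_update", "lightweight_update_force"] as const;
type DeleteMode = (typeof DELETE_MODES)[number];

interface TuningConfig {
  idleSocketTtlMs: number;
  maxOpenConnections: number;
  maxBytesBeforeExternalGroupBy: number;
  asyncInsertMaxDataSize: number | undefined;
  asyncInsertBusyTimeoutMs: number | undefined;
  deleteMode: DeleteMode;
}

// Parse a numeric variable; unset or empty yields undefined, garbage throws.
function optionalNumber(env: Env, key: string): number | undefined {
  const raw = env[key];
  if (raw === undefined || raw === "") return undefined;
  const value = Number(raw);
  if (!Number.isFinite(value)) {
    throw new Error(`${key} must be a number, got "${raw}"`);
  }
  return value;
}

// Accept only the three documented deletion strategies.
function parseDeleteMode(raw: string | undefined): DeleteMode {
  if (raw === undefined || raw === "") return "alter_update";
  const match = DELETE_MODES.find((mode) => mode === raw);
  if (match === undefined) {
    throw new Error(`Unknown CLICKHOUSE_LIGHTWEIGHT_DELETE_MODE: "${raw}"`);
  }
  return match;
}

function readTuningConfig(env: Env): TuningConfig {
  return {
    idleSocketTtlMs: optionalNumber(env, "CLICKHOUSE_KEEP_ALIVE_IDLE_SOCKET_TTL") ?? 9000,
    maxOpenConnections: optionalNumber(env, "CLICKHOUSE_MAX_OPEN_CONNECTIONS") ?? 25,
    maxBytesBeforeExternalGroupBy:
      optionalNumber(env, "CLICKHOUSE_MAX_BYTES_BEFORE_EXTERNAL_GROUP_BY") ?? 32_000_000_000,
    // The async insert limits have no documented default, so they stay undefined when unset.
    asyncInsertMaxDataSize: optionalNumber(env, "CLICKHOUSE_ASYNC_INSERT_MAX_DATA_SIZE"),
    asyncInsertBusyTimeoutMs: optionalNumber(env, "CLICKHOUSE_ASYNC_INSERT_BUSY_TIMEOUT_MS"),
    deleteMode: parseDeleteMode(env["CLICKHOUSE_LIGHTWEIGHT_DELETE_MODE"]),
  };
}
```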
Quick Install
# Start ClickHouse via Docker Compose
pnpm run infra:dev:up
# Apply all pending ClickHouse migrations
cd packages/shared
bash clickhouse/scripts/up.sh
# (Optional) Create dev-only experimental tables
bash clickhouse/scripts/dev-tables.sh
Code Evidence
ClickHouse client initialization from packages/shared/src/server/clickhouse/client.ts:
const client = createClient({
  keep_alive: {
    idle_socket_ttl: env.CLICKHOUSE_KEEP_ALIVE_IDLE_SOCKET_TTL, // Default: 9000ms
  },
  max_open_connections: env.CLICKHOUSE_MAX_OPEN_CONNECTIONS, // Default: 25
  clickhouse_settings: {
    async_insert: 1,
    wait_for_async_insert: 1,
  },
});
Deletion timeout and query retry limit from packages/shared/src/env.ts:
LANGFUSE_CLICKHOUSE_DELETION_TIMEOUT_MS: z.coerce.number().default(600_000), // 10 minutes
LANGFUSE_CLICKHOUSE_QUERY_MAX_ATTEMPTS: z.coerce.number().default(3),
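To make the role of LANGFUSE_CLICKHOUSE_QUERY_MAX_ATTEMPTS concrete, here is a minimal sketch of a bounded retry loop. It is not the Langfuse retry code: `withQueryRetries` and `isSocketHangUp` are hypothetical names, detecting the error by its message text is an assumption, and no backoff is applied between attempts.

```typescript
// Hypothetical retry wrapper for illustration; not taken from the Langfuse codebase.

// Assumption: a retryable failure is recognised by "socket hang up" in the error message.
function isSocketHangUp(error: unknown): boolean {
  return error instanceof Error && error.message.toLowerCase().includes("socket hang up");
}

// Run `query` up to `maxAttempts` times, retrying only on socket hang-ups.
async function withQueryRetries<T>(
  query: () => Promise<T>,
  maxAttempts: number = 3, // mirrors the documented default
): Promise<T> {
  if (maxAttempts < 1) {
    throw new RangeError("maxAttempts must be at least 1");
  }
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await query();
    } catch (error) {
      lastError = error;
      // Any other failure is surfaced immediately instead of being retried.
      if (!isSocketHangUp(error)) throw error;
    }
  }
  // All attempts failed with a retryable error; surface the last one.
  throw lastError;
}
```

A caller would wrap the actual client call, for example `withQueryRetries(() => runQuery(sql), maxAttempts)`, where `runQuery` stands in for whatever function issues the query.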
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| Connection refused on port 8123 | ClickHouse not running | Run pnpm run infra:dev:up |
| MEMORY_LIMIT_EXCEEDED | Query exceeds memory limit | Increase CLICKHOUSE_MAX_BYTES_BEFORE_EXTERNAL_GROUP_BY or optimize query |
| Socket hang up | Connection timeout on long queries | Retried automatically up to LANGFUSE_CLICKHOUSE_QUERY_MAX_ATTEMPTS (default: 3) |
| READONLY | Connected to read-only replica for write | Check CLICKHOUSE_URL points to primary node |
Compatibility Notes
- Cluster Mode: Controlled by CLICKHOUSE_CLUSTER_ENABLED. When enabled, DDL and data operations target the cluster.
- Read Replicas: Separate URLs for read-only access to legacy tables (CLICKHOUSE_READ_ONLY_URL) and events table (CLICKHOUSE_EVENTS_READ_ONLY_URL).
- Deletion Strategies: Three modes available via CLICKHOUSE_LIGHTWEIGHT_DELETE_MODE: alter_update (default, safest), lightweight_update, and lightweight_update_force.
- Long Queries: Queries exceeding 30 seconds automatically enable HTTP progress headers at 10-second intervals.
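The read-replica settings suggest a simple routing rule, sketched below under stated assumptions. `resolveClickhouseUrl` and `TableKind` are hypothetical names, and falling back to CLICKHOUSE_URL when no replica URL is configured is an assumption rather than documented behaviour. Routing writes to the primary is what avoids the READONLY error listed under Common Errors.

```typescript
// Hypothetical routing helper for illustration; not taken from the Langfuse codebase.
type Env = Record<string, string | undefined>;
type TableKind = "events" | "legacy";

// Pick the endpoint for a query: writes always use the primary,
// reads prefer the replica configured for the table kind.
function resolveClickhouseUrl(env: Env, table: TableKind, readOnly: boolean): string {
  const primary = env["CLICKHOUSE_URL"];
  if (primary === undefined || primary === "") {
    throw new Error("CLICKHOUSE_URL is required");
  }
  if (!readOnly) return primary;

  const replica =
    table === "events"
      ? env["CLICKHOUSE_EVENTS_READ_ONLY_URL"]
      : env["CLICKHOUSE_READ_ONLY_URL"];

  // Assumption: without a configured replica, reads go to the primary.
  return replica !== undefined && replica !== "" ? replica : primary;
}
```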