Environment:Wandb Weave Trace Server Infrastructure

From Leeroopedia
Knowledge Sources
Domains Infrastructure, Backend, Database
Last Updated 2026-02-14 12:00 GMT

Overview

Server-side infrastructure environment with ClickHouse, Kafka, and cloud storage (S3/Azure/GCP) for the Weave trace server.

Description

This environment defines the infrastructure requirements for running the Weave trace server. ClickHouse is required as the primary data store. Kafka is optional and is needed only for event streaming and online evaluation. File storage can use a Bring-Your-Own-Bucket (BYOB) backend on AWS S3, Azure Blob Storage, or Google Cloud Storage. The trace server's dependencies also include ddtrace for APM, litellm for LLM scoring support, and OpenTelemetry libraries for trace ingestion.

Usage

Use this environment when self-hosting the Weave trace server or deploying its server-side components. It is not required for SDK-only usage that connects to the hosted W&B cloud service. It is the prerequisite for the build and publish implementations in the SDK Release workflow.

System Requirements

| Category | Requirement | Notes |
| --- | --- | --- |
| OS | Linux (Ubuntu recommended) | Production server deployment |
| ClickHouse | ClickHouse server | Default: `localhost:8123`, configurable via env vars |
| Kafka | Apache Kafka (optional) | Required only for online evaluation and event streaming |
| Network | Accessible storage endpoints | S3, Azure Blob, or GCS for BYOB file storage |
| Disk | SSD recommended | High IOPS for ClickHouse and file caching |

Dependencies

System Packages

  • ClickHouse server
  • Apache Kafka (optional)

Python Packages

  • `ddtrace` >= 2.7.0
  • `boto3` >= 1.34.0 (BYOB S3)
  • `azure-storage-blob` >= 12.24.0, < 12.26.0 (BYOB Azure)
  • `google-cloud-storage` >= 2.7.0 (BYOB GCP)
  • `litellm` >= 1.36.1 (LLM scoring support)
  • `opentelemetry-proto` >= 1.12.0
  • `opentelemetry-semantic-conventions-ai` >= 0.4.3
  • `openinference-semantic-conventions` >= 0.1.17
  • `emoji` >= 2.12.1

Credentials

ClickHouse

  • `WF_CLICKHOUSE_HOST`: ClickHouse hostname (default: `localhost`)
  • `WF_CLICKHOUSE_PORT`: ClickHouse port (default: `8123`)
  • `WF_CLICKHOUSE_USER`: ClickHouse username (default: `default`)
  • `WF_CLICKHOUSE_PASS`: ClickHouse password (default: empty)
  • `WF_CLICKHOUSE_DATABASE`: Database name (default: `default`)
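
As a minimal sketch of how these variables and their documented defaults fit together, the snippet below reads them from a mapping such as `os.environ`. The `ClickHouseSettings` class and the `load_clickhouse_settings` function are hypothetical names used only for illustration; the actual configuration lives in `weave/trace_server/environment.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ClickHouseSettings:
    """Hypothetical container for the ClickHouse connection settings."""
    host: str
    port: int
    user: str
    password: str
    database: str


def load_clickhouse_settings(
    env: Optional[Mapping[str, str]] = None,
) -> ClickHouseSettings:
    """Read WF_CLICKHOUSE_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return ClickHouseSettings(
        host=env.get("WF_CLICKHOUSE_HOST", "localhost"),
        port=int(env.get("WF_CLICKHOUSE_PORT", "8123")),
        user=env.get("WF_CLICKHOUSE_USER", "default"),
        password=env.get("WF_CLICKHOUSE_PASS", ""),
        database=env.get("WF_CLICKHOUSE_DATABASE", "default"),
    )
```

Passing an explicit mapping instead of relying on `os.environ` keeps the helper easy to test without touching the process environment.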

Kafka

  • `KAFKA_BROKER_HOST`: Kafka broker hostname (default: `localhost`)
  • `KAFKA_BROKER_PORT`: Kafka broker port (default: `9092`)
  • `KAFKA_CLIENT_USER`: Kafka authentication username (optional)
  • `KAFKA_CLIENT_PASSWORD`: Kafka authentication password (optional)
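
The broker variables can be combined into a single `host:port` address, and the optional credentials can be treated as absent unless both are set. The sketch below illustrates that idea; `kafka_bootstrap_server` and `kafka_credentials` are hypothetical helper names, and how the trace server actually passes these values to its Kafka client is not shown in this page.

```python
import os
from typing import Mapping, Optional, Tuple


def kafka_bootstrap_server(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a host:port address from KAFKA_BROKER_HOST and KAFKA_BROKER_PORT."""
    env = os.environ if env is None else env
    host = env.get("KAFKA_BROKER_HOST", "localhost")
    port = int(env.get("KAFKA_BROKER_PORT", "9092"))
    return f"{host}:{port}"


def kafka_credentials(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Tuple[str, str]]:
    """Return (user, password) only when both optional variables are set."""
    env = os.environ if env is None else env
    user = env.get("KAFKA_CLIENT_USER")
    password = env.get("KAFKA_CLIENT_PASSWORD")
    if user and password:
        return (user, password)
    return None
```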

AWS S3 (BYOB)

  • `WF_FILE_STORAGE_URI`: S3 bucket URI
  • `WF_FILE_STORAGE_AWS_ACCESS_KEY_ID`: AWS access key
  • `WF_FILE_STORAGE_AWS_SECRET_ACCESS_KEY`: AWS secret key
  • `WF_FILE_STORAGE_AWS_SESSION_TOKEN`: AWS session token (optional)
  • `WF_FILE_STORAGE_AWS_KMS_KEY`: KMS encryption key (optional)
  • `WF_FILE_STORAGE_AWS_REGION`: AWS region

Azure Blob (BYOB)

  • `WF_FILE_STORAGE_AZURE_CONNECTION_STRING`: Azure connection string
  • `WF_FILE_STORAGE_AZURE_ACCESS_KEY`: Azure access key
  • `WF_FILE_STORAGE_AZURE_ACCOUNT_URL`: Azure account URL

GCP (BYOB)

  • `WF_FILE_STORAGE_GCP_CREDENTIALS_JSON_B64`: Base64-encoded GCP credentials JSON
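
Because the GCP variable expects the credentials JSON in Base64 form, it can help to see how such a value might be produced and checked. The sketch below assumes the standard Base64 alphabet and UTF-8 encoded JSON, which this page does not confirm; the function names and the credentials dictionary are placeholders, not real values.

```python
import base64
import json


def encode_credentials(credentials: dict) -> str:
    """Serialize a credentials dict to JSON and Base64-encode it."""
    raw = json.dumps(credentials).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_credentials(value: str) -> dict:
    """Reverse encode_credentials; useful as a sanity check before deploying."""
    return json.loads(base64.b64decode(value))


# Placeholder content only; a real service account file has more fields.
example = {"type": "service_account", "project_id": "example-project"}
encoded = encode_credentials(example)
```

The resulting `encoded` string is what would be assigned to `WF_FILE_STORAGE_GCP_CREDENTIALS_JSON_B64` under these assumptions.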

Feature Flags

  • `WEAVE_ENABLE_ONLINE_EVAL`: Enable online evaluation worker (default: `false`)
  • `WF_SCORING_WORKER_BATCH_SIZE`: Scoring worker batch size (default: `100`)
  • `WF_SCORING_WORKER_BATCH_TIMEOUT`: Scoring worker batch timeout in seconds (default: `5`)
  • `WF_FILE_STORAGE_PROJECT_ALLOW_LIST`: Comma-separated project IDs for BYOB
  • `WF_FILE_STORAGE_PROJECT_RAMP_PCT`: BYOB rollout percentage (0-100)
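
One common way to combine an allow list with a rollout percentage is to hash each project ID into a stable bucket from 0 to 99 and enable the feature when the bucket falls below the percentage. The sketch below illustrates that general technique only. It is an assumption, not the trace server's actual selection logic, and all function names are hypothetical.

```python
import hashlib


def parse_allow_list(raw: str) -> set[str]:
    """Split a comma-separated string into project IDs, ignoring blanks."""
    return {item.strip() for item in raw.split(",") if item.strip()}


def rollout_bucket(project_id: str) -> int:
    """Map a project ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(project_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def byob_enabled(project_id: str, allow_list: set[str], ramp_pct: int) -> bool:
    """Illustrative rule: allow-listed projects always pass, others by bucket."""
    if project_id in allow_list:
        return True
    return rollout_bucket(project_id) < ramp_pct
```

Hashing keeps the decision deterministic, so a given project stays on the same side of the rollout as the percentage increases.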

Quick Install

# Install Weave with trace server dependencies
pip install "weave[trace_server]"
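
After installation, the presence of the listed dependencies can be checked with the standard library's `importlib.metadata`. This is a minimal sketch; `installed_versions` is a hypothetical helper, and it reports versions without enforcing the minimum versions listed above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Distribution names taken from the dependency list on this page.
TRACE_SERVER_PACKAGES = [
    "ddtrace",
    "boto3",
    "azure-storage-blob",
    "google-cloud-storage",
    "litellm",
    "opentelemetry-proto",
    "opentelemetry-semantic-conventions-ai",
    "openinference-semantic-conventions",
    "emoji",
]


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Return each package's installed version, or None if it is missing."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found
```

Calling `installed_versions(TRACE_SERVER_PACKAGES)` returns a dictionary in which any `None` value marks a package that still needs to be installed.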

Code Evidence

ClickHouse configuration from `weave/trace_server/environment.py`:

wf_clickhouse_host: str = "localhost"
wf_clickhouse_port: int = 8123
wf_clickhouse_user: str = "default"
wf_clickhouse_pass: str = ""
wf_clickhouse_database: str = "default"

Kafka configuration from `weave/trace_server/environment.py`:

kafka_broker_host: str = "localhost"
kafka_broker_port: int = 9092

BYOB storage configuration from `weave/trace_server/environment.py`:

wf_file_storage_uri: str | None = None  # S3/Azure/GCP URI
wf_file_storage_project_allow_list: list[str] = []
wf_file_storage_project_ramp_pct: int = 0

Trace server dependencies from `pyproject.toml:71-87`:

trace_server = [
  "ddtrace>=2.7.0",
  "boto3>=1.34.0",
  "azure-storage-blob>=12.24.0,<12.26.0",
  "google-cloud-storage>=2.7.0",
  "litellm>=1.36.1",
  "opentelemetry-proto>=1.12.0",
  "opentelemetry-semantic-conventions-ai>=0.4.3",
  "openinference-semantic-conventions>=0.1.17",
  "emoji>=2.12.1",
]

Common Errors

| Error Message | Cause | Solution |
| --- | --- | --- |
| ClickHouse connection refused | ClickHouse not running or wrong host/port | Verify `WF_CLICKHOUSE_HOST` and `WF_CLICKHOUSE_PORT` |
| Kafka broker unavailable | Kafka not running or wrong broker config | Verify `KAFKA_BROKER_HOST` and `KAFKA_BROKER_PORT` |
| S3 access denied | Invalid AWS credentials | Verify `WF_FILE_STORAGE_AWS_*` environment variables |
| Azure authentication error | Invalid connection string or access key | Verify `WF_FILE_STORAGE_AZURE_*` environment variables |
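
For the connection errors above, a quick TCP reachability check can help separate a wrong host or port from a credentials problem. The sketch below uses only the standard library; `can_connect` is a hypothetical helper, and a successful connection shows only that something is listening on that port, not that it is ClickHouse or Kafka.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `can_connect("localhost", 8123)` probes the default ClickHouse address and `can_connect("localhost", 9092)` probes the default Kafka broker address.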

Compatibility Notes

  • ClickHouse Replication: Set `WF_CLICKHOUSE_REPLICATED=true` for replicated setups; requires `WF_CLICKHOUSE_REPLICATED_PATH` and `WF_CLICKHOUSE_REPLICATED_CLUSTER`.
  • Distributed Tables: Set `WF_CLICKHOUSE_USE_DISTRIBUTED_TABLES=true` for sharded ClickHouse clusters.
  • Memory Limits: Use `WF_CLICKHOUSE_MAX_MEMORY_USAGE` and `WF_CLICKHOUSE_MAX_EXECUTION_TIME` to constrain resource usage.
  • BYOB Rollout: Use `WF_FILE_STORAGE_PROJECT_RAMP_PCT` (0-100) for gradual rollout of BYOB storage.
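
Since replicated setups require two additional variables, a startup check can catch an incomplete configuration early. The sketch below is illustrative only: `missing_replication_vars` is a hypothetical helper, and the set of strings treated as true is an assumption rather than the trace server's own parsing rule.

```python
import os
from typing import List, Mapping, Optional

# Assumed truthy spellings; the server's actual parsing may differ.
TRUE_VALUES = {"1", "true", "yes"}

REPLICATION_VARS = [
    "WF_CLICKHOUSE_REPLICATED_PATH",
    "WF_CLICKHOUSE_REPLICATED_CLUSTER",
]


def missing_replication_vars(
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """List required replication variables that are unset when replication is on."""
    env = os.environ if env is None else env
    flag = env.get("WF_CLICKHOUSE_REPLICATED", "false").strip().lower()
    if flag not in TRUE_VALUES:
        return []
    return [name for name in REPLICATION_VARS if not env.get(name)]
```

An empty list means either that replication is disabled or that both companion variables are present.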
