Heuristic:Wandb Weave Payload Size Limits

From Leeroopedia
Knowledge Sources
Domains: Optimization, Performance, Data_Management
Last Updated: 2026-02-14 12:00 GMT

Overview

Size limits and chunking heuristics for trace payloads and table uploads to prevent server rejection and optimize transfer.

Description

Weave enforces several payload size limits to prevent server-side rejection (HTTP 413) and optimize data transfer. The remote server has a hard limit of 32 MiB per request; the client uses a 31 MiB soft limit (1 MiB buffer). For large tables, an automatic chunking system splits data based on row count thresholds and estimated byte size. Individual trace payloads (call inputs/outputs) trigger a warning at 3.5 MiB. String representations are truncated at 1,000 characters to prevent payload bloat.

Usage

Use this heuristic when you encounter HTTP 413 errors or large payload warnings, or when you need to understand why table uploads are being chunked. It is also relevant when tracing functions with very large inputs or outputs (e.g., large documents, images, model weights).

The Insight (Rule of Thumb)

  • Action: Keep individual trace payloads (inputs + outputs) under 3.5 MiB per call.
  • Value: `MAX_TRACE_PAYLOAD_SIZE = 3.5 MiB`. A warning is logged when a payload exceeds it; the value is tied to ClickHouse's maximum single-row insert size.
  • Trade-off: Large payloads slow tracing and may hit server limits.
  • Action: For table operations, let the auto-chunking handle splitting. Tables with more than 1,000 rows are chunked automatically.
  • Value: `ROW_COUNT_CHUNKING_THRESHOLD = 1000`. At or below this row count, the client estimates the total size as `sample_row_size * num_rows * 2` and chunks only if the estimate exceeds 31 MiB.
  • Trade-off: Chunking adds HTTP overhead but prevents 413 errors.
  • Action: Use parallel table upload (default enabled) for large tables.
  • Value: `WEAVE_USE_PARALLEL_TABLE_UPLOAD=true` (default). Tests endpoint availability before attempting parallel upload.
  • Trade-off: Parallel upload is faster but requires server support; falls back to incremental if unavailable.
  • Action: Expect string representations in serialized payloads to be truncated at 1,000 characters.
  • Value: `MAX_STR_LEN = 1000` in `weave/trace/serialization/serialize.py`.
  • Trade-off: Prevents bloated payloads from large `__repr__` strings but may lose debug information.
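To check ahead of time whether a call is likely to trigger the 3.5 MiB warning, the payload size can be approximated before tracing. The following is a minimal sketch, not Weave's own measurement: the helper names `estimated_payload_bytes` and `exceeds_trace_limit` are hypothetical, and it assumes that the UTF-8 length of a JSON dump is a reasonable proxy for the size the client sees.

```python
import json

# Mirrors the constant quoted from weave/trace/weave_client.py.
MAX_TRACE_PAYLOAD_SIZE = int(3.5 * 1024 * 1024)  # 3.5 MiB = 3,670,016 bytes


def estimated_payload_bytes(inputs: dict, output: object) -> int:
    """Hypothetical helper: approximate a call's payload as UTF-8 JSON bytes."""
    payload = {"inputs": inputs, "output": output}
    # default=str keeps the estimate from failing on values JSON cannot encode.
    return len(json.dumps(payload, default=str).encode("utf-8"))


def exceeds_trace_limit(inputs: dict, output: object) -> bool:
    """Return True if the approximate payload is larger than 3.5 MiB."""
    return estimated_payload_bytes(inputs, output) > MAX_TRACE_PAYLOAD_SIZE
```

If the check returns True, consider passing a reference (a path or an ID) to the traced function instead of the large object itself.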

Reasoning

The 31 MiB client-side limit is set as `(32 - 1) * 1024 * 1024` (32,505,856 bytes): the real server limit minus a 1 MiB safety buffer for HTTP headers and serialization overhead. The table chunking heuristic uses a two-phase approach: it first checks the row count (cheap), then estimates the byte size from a sample row (moderately cheap), which avoids the expense of serializing the entire table just to check its size. The `* 2` multiplier in the byte estimation accounts for JSON serialization overhead.
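The two-phase decision can be sketched as follows. This is an illustration of the heuristic as described here, not the client's actual code: `should_chunk` is a hypothetical name, and using the first row as the sample and JSON length as the row size are both assumptions.

```python
import json

# Constants as quoted from weave/trace_server_bindings/http_utils.py.
REMOTE_REQUEST_BYTES_LIMIT = (32 - 1) * 1024 * 1024  # 31 MiB = 32,505,856 bytes
ROW_COUNT_CHUNKING_THRESHOLD = 1000


def should_chunk(rows: list) -> bool:
    """Hypothetical helper illustrating the two-phase chunking check."""
    if not rows:
        return False
    # Phase 1 (cheap): a large row count triggers chunking outright.
    if len(rows) > ROW_COUNT_CHUNKING_THRESHOLD:
        return True
    # Phase 2 (moderately cheap): serialize one sample row only.
    # Assumption: the first row is representative of the table.
    sample_row_size = len(json.dumps(rows[0], default=str).encode("utf-8"))
    # The factor of 2 pads the estimate for serialization overhead.
    estimated_bytes = sample_row_size * len(rows) * 2
    return estimated_bytes > REMOTE_REQUEST_BYTES_LIMIT
```

For example, ten rows that each hold a 2 MiB string give an estimate of roughly 40 MiB, so they are chunked even though the row count is far below the threshold.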

When HTTP 413 is returned, the client uses recursive binary splitting: the batch is split in half and each half is retried independently. This continues recursively until all sub-batches are within limits.
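The recursive binary split can be illustrated with a small sketch. The exception class `PayloadTooLarge`, the function `send_with_split`, and the `send` callback are hypothetical stand-ins for the HTTP layer, and re-raising when a single item is still too large is an assumption about the base case rather than documented behavior.

```python
from typing import Callable, Sequence


class PayloadTooLarge(Exception):
    """Hypothetical stand-in for an HTTP 413 response."""


def send_with_split(batch: Sequence, send: Callable[[Sequence], None]) -> None:
    """Send a batch; on a 413-style failure, halve it and retry each half."""
    if not batch:
        return
    try:
        send(batch)
    except PayloadTooLarge:
        if len(batch) == 1:
            # Assumption: a single oversized item cannot be split further.
            raise
        mid = len(batch) // 2
        send_with_split(batch[:mid], send)
        send_with_split(batch[mid:], send)
```

With a `send` callback that rejects any batch of more than three items, a batch of ten items ends up being delivered as four sub-batches of sizes 2, 3, 2, and 3, in the original order.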

Code Evidence

Remote request bytes limit from `weave/trace_server_bindings/http_utils.py:20-22`:

# Default remote request bytes limit (32 MiB real limit - 1 MiB buffer)
REMOTE_REQUEST_BYTES_LIMIT = (32 - 1) * 1024 * 1024
ROW_COUNT_CHUNKING_THRESHOLD = 1000

Trace payload size limit from `weave/trace/weave_client.py:289-291`:

BACKGROUND_PARALLELISM_MIX = 0.5
# This size is correlated with the maximum single row insert size
# in clickhouse, which is currently unavoidable.
MAX_TRACE_PAYLOAD_SIZE = int(3.5 * 1024 * 1024)  # 3.5 MiB

String truncation limit from `weave/trace/serialization/serialize.py:150`:

MAX_STR_LEN = 1000
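The effect of this limit can be sketched as below. `truncate_repr` is a hypothetical helper, not a function from `serialize.py`, and it assumes a plain cut at the limit; whether the real serializer appends a marker such as an ellipsis is not stated here.

```python
MAX_STR_LEN = 1000  # as quoted from weave/trace/serialization/serialize.py


def truncate_repr(value: object, max_len: int = MAX_STR_LEN) -> str:
    """Hypothetical helper: cap an object's repr at max_len characters."""
    text = repr(value)
    if len(text) <= max_len:
        return text
    # Assumption: a plain cut with no truncation marker.
    return text[:max_len]
```

Objects whose `__repr__` carries important debugging detail past the first 1,000 characters lose that detail in the stored payload.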

Table chunking heuristic from `weave/trace/weave_client.py` (row count and byte estimation logic):

# Primary heuristic: Row count > ROW_COUNT_CHUNKING_THRESHOLD triggers chunking
# Secondary heuristic: estimates total bytes as sample_row_size * num_rows * 2
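Once chunking is triggered, the rows have to be grouped into requests that stay under the byte limit. The page does not show how the client forms its chunks, so the greedy packer below is only one plausible approach; `chunk_rows` is a hypothetical name, and it ignores the size of the surrounding request envelope.

```python
import json
from typing import Iterator, List

REMOTE_REQUEST_BYTES_LIMIT = (32 - 1) * 1024 * 1024  # 31 MiB


def chunk_rows(
    rows: List[dict], byte_limit: int = REMOTE_REQUEST_BYTES_LIMIT
) -> Iterator[List[dict]]:
    """Hypothetical greedy packer: yield row groups whose JSON size fits byte_limit."""
    chunk: List[dict] = []
    chunk_bytes = 0
    for row in rows:
        row_bytes = len(json.dumps(row, default=str).encode("utf-8"))
        # Start a new chunk when adding this row would exceed the limit.
        if chunk and chunk_bytes + row_bytes > byte_limit:
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(row)
        chunk_bytes += row_bytes
    if chunk:
        yield chunk
```

A single row larger than `byte_limit` still gets a chunk of its own in this sketch, so such a row would still need the 413 split-and-retry path or a smaller payload.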
