

Heuristic: TruEra TruLens Trace Compression Token Limits

From Leeroopedia
Knowledge Sources
Domains Optimization, LLMs
Last Updated 2026-02-14 08:00 GMT

Overview

A trace compression strategy that targets 40,000 tokens, triggers on traces larger than 500KB, and enforces a 20,000-token minimum floor for very large traces, while preserving plans and error details.

Description

When TruLens sends trace data to an LLM for evaluation (e.g., agent evaluation metrics like tool selection scoring), the trace may be very large. The trace compression module reduces token usage while preserving essential information like plans, key decisions, and error details. The system uses three thresholds: a 500KB raw size limit triggers compression, a 40,000 token default target that leaves room for model output in the context window, and a 20,000 token minimum floor that prevents over-compression of very large traces. Strings are safely truncated without breaking JSON structure.
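The threshold logic described above can be sketched as follows. This is a minimal illustration, not the TruLens implementation: the three constants match the source, but `truncate_strings`, `estimate_tokens`, and `maybe_compress` are hypothetical helper names, and the ~4-characters-per-token estimate is a common rough heuristic.

```python
import json

# Constants from trace_compression.py; everything else here is illustrative.
MAX_TRACE_SIZE = 500_000              # bytes; raw size that triggers compression
DEFAULT_TOKEN_LIMIT = 40_000          # default compression target
MAX_TRACE_SIZE_TOKEN_LIMIT = 20_000   # minimum floor for very large traces

def truncate_strings(obj, max_len=2000):
    """Recursively shorten long strings so the JSON structure stays intact."""
    if isinstance(obj, str):
        return obj if len(obj) <= max_len else obj[:max_len] + "..."
    if isinstance(obj, dict):
        return {k: truncate_strings(v, max_len) for k, v in obj.items()}
    if isinstance(obj, list):
        return [truncate_strings(v, max_len) for v in obj]
    return obj

def estimate_tokens(s: str) -> int:
    return len(s) // 4  # rough heuristic: ~4 characters per token

def maybe_compress(trace: dict, token_limit: int = DEFAULT_TOKEN_LIMIT) -> dict:
    raw = json.dumps(trace)
    if len(raw.encode("utf-8")) <= MAX_TRACE_SIZE:
        return trace  # below the 500KB trigger: send as-is
    # Never aim below the 20K-token floor, even for huge traces.
    token_limit = max(token_limit, MAX_TRACE_SIZE_TOKEN_LIMIT)
    max_len, out = 2000, trace
    while max_len >= 50:
        out = truncate_strings(trace, max_len)
        if estimate_tokens(json.dumps(out)) <= token_limit:
            break
        max_len //= 2  # tighten truncation until the estimate fits
    return out
```

Note that truncation happens on individual string values rather than on the serialized JSON text, which is what keeps the output parseable.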

Usage

Apply this heuristic when evaluating agent traces that may be large (multi-step LangGraph agents, complex tool-use chains). The defaults are tuned for models with 128K context windows. If using models with smaller context windows (e.g., 32K), reduce the `DEFAULT_TOKEN_LIMIT`. If traces are being over-compressed and losing important information, increase `MAX_TRACE_SIZE_TOKEN_LIMIT`.
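One way to pick a reduced limit for a 32K-context model is to keep roughly the same ~30% proportion the 128K defaults use. The arithmetic below is a sketch; the reserve sizes are assumptions, not TruLens values.

```python
# Scaling the 40K default down for a 32K-context model (illustrative numbers).
CONTEXT_WINDOW = 32_000   # tokens available in the smaller model
OUTPUT_RESERVE = 4_000    # assumed room for the model's response
PROMPT_RESERVE = 2_000    # assumed room for system + evaluation prompts

# Keep the ~30% share the defaults use, capped by what fits after reserves.
scaled_limit = min(
    int(CONTEXT_WINDOW * 0.30),
    CONTEXT_WINDOW - OUTPUT_RESERVE - PROMPT_RESERVE,
)
print(scaled_limit)  # 9600
```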

The Insight (Rule of Thumb)

  • Action: Use default token limits for standard agent evaluation. Adjust only for non-standard context window sizes.
  • Values:
    • `MAX_TRACE_SIZE` = 500,000 bytes (500KB) — triggers compression
    • `DEFAULT_TOKEN_LIMIT` = 40,000 tokens — default compression target
    • `MAX_TRACE_SIZE_TOKEN_LIMIT` = 20,000 tokens — minimum floor for very large traces
  • Trade-off: Lower token limits reduce cost but may lose important trace details. Higher limits preserve more context but increase evaluation latency and cost.

Reasoning

Modern LLMs have context windows of 128K-200K tokens. The 40,000 token default target uses roughly 30% of a 128K context window, leaving ample room for the system prompt, evaluation prompt, and model response. The 500KB raw size threshold is a practical heuristic — traces below this size are unlikely to benefit from compression. The 20,000 token minimum floor ensures that even very large traces retain enough detail for meaningful evaluation.

The `Selector.MAX_SIZE` of 400KB (400,000 bytes) for individual selector data serves a similar purpose — it ensures that selected span data fits within 100K token context windows.
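Both budget claims above follow from simple arithmetic, shown here with the common ~4-bytes-per-token approximation (an assumption, not a TruLens constant):

```python
# Share of a 128K context window consumed by the default trace target.
DEFAULT_TOKEN_LIMIT = 40_000
CONTEXT_128K = 128_000
trace_share = DEFAULT_TOKEN_LIMIT / CONTEXT_128K
print(f"{trace_share:.0%}")        # 31%

# Selector.MAX_SIZE in bytes mapped to tokens at ~4 bytes per token.
SELECTOR_MAX_SIZE = 400_000
BYTES_PER_TOKEN = 4                # rough approximation
selector_tokens = SELECTOR_MAX_SIZE // BYTES_PER_TOKEN
print(selector_tokens)             # 100000
```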

Code Evidence

Compression constants from `src/core/trulens/core/utils/trace_compression.py:14-16`:

MAX_TRACE_SIZE = 500000  # 500KB
MAX_TRACE_SIZE_TOKEN_LIMIT = 20000  # Minimum token limit for very large traces
DEFAULT_TOKEN_LIMIT = 40000  # Default target, leaves room for model output

Selector size limit from `src/core/trulens/core/feedback/selector.py:175`:

MAX_SIZE = 400000  # 400KB limit for 100k token compatibility

Session batch and concurrency defaults from `src/core/trulens/core/session.py:119-142`:

RETRY_RUNNING_SECONDS: float = 60.0
"""How long to wait (in seconds) before restarting a feedback function that has already started"""

RETRY_FAILED_SECONDS: float = 5 * 60.0
"""How long to wait (in seconds) to retry a failed feedback function run."""

DEFERRED_NUM_RUNS: int = 32
"""Number of futures to wait for when evaluating deferred feedback functions."""

RECORDS_BATCH_TIMEOUT_IN_SEC: int = 10
"""Time to wait before inserting a batch of records into the database."""

GROUND_TRUTHS_BATCH_SIZE: int = 100
"""Number of ground truth records to insert into the database per batch."""

Max serialization size from `src/core/trulens/core/schema/base.py:11`:

MAX_DILL_SIZE: int = 1024 * 1024  # 1MB
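A guard against this cap might look like the sketch below. The `fits_serialization_limit` helper is hypothetical, and stdlib `pickle` stands in for the `dill` package that the constant's name refers to.

```python
import pickle  # stdlib stand-in for `dill`, which the constant's name refers to

MAX_DILL_SIZE = 1024 * 1024  # 1MB cap, as in schema/base.py

def fits_serialization_limit(obj) -> bool:
    """Illustrative guard: reject objects whose serialized form exceeds the cap."""
    return len(pickle.dumps(obj)) <= MAX_DILL_SIZE

print(fits_serialization_limit({"small": "payload"}))       # True
print(fits_serialization_limit("x" * (2 * MAX_DILL_SIZE)))  # False
```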
