Heuristic:Vibrantlabsai Ragas Analytics Silent Failure Pattern
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Debugging |
| Last Updated | 2026-02-12 10:00 GMT |
Overview
Design pattern for analytics telemetry: silently swallow all tracking errors in production so they never block evaluation workflows, with opt-in debug logging.
Description
Ragas collects anonymous usage analytics (evaluation counts, LLM provider usage) to understand adoption patterns. The analytics system is designed with a fail-silent principle: any exception during tracking is caught and suppressed so it never interrupts the user's evaluation workflow. A 1-second timeout on HTTP requests ensures analytics never adds noticeable latency. Events are batched (10 per batch, 10-second flush interval) and sent via a daemon thread to avoid blocking.
Usage
Use this heuristic when:
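- Debugging analytics issues: Set `__RAGAS_DEBUG_TRACKING=true` to surface tracking errors, which are otherwise logged only at debug level. Per the decorator under Code Evidence, errors are then logged at info level, and are re-raised only when debug mode is also enabled.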
- Running in air-gapped environments: Analytics will silently fail with no impact on evaluation.
- Opting out entirely: Set `RAGAS_DO_NOT_TRACK=true` to disable all telemetry.
- Designing similar telemetry: Follow this pattern of silent failure + opt-in debugging.
The Insight (Rule of Thumb)
- Action: Wrap all analytics code with the `@silent` decorator. Set aggressive timeouts (1s). Use daemon threads for batching.
- Value: 1-second HTTP timeout, batch size 10, flush every 10 seconds, daemon thread for background sending.
- Trade-off: Some analytics events may be lost during network issues or rapid program exit. This is acceptable since analytics is non-critical.
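The batching values above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Ragas implementation: `EventBatcher`, its `send` callback, and the queue-based design are hypothetical names and choices; only the batch size of 10, the 10-second flush interval, and the use of a daemon thread come from the text.

```python
import queue
import threading
import time
import typing as t

BATCH_SIZE = 10            # from the text: 10 events per batch
FLUSH_INTERVAL_SEC = 10.0  # from the text: flush every 10 seconds


class EventBatcher:
    """Hypothetical batcher: buffers events and flushes them from a daemon thread."""

    def __init__(
        self,
        send: t.Callable[[t.List[dict]], None],
        batch_size: int = BATCH_SIZE,
        flush_interval: float = FLUSH_INTERVAL_SEC,
    ) -> None:
        self._send = send
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._queue: "queue.Queue[dict]" = queue.Queue()
        # daemon=True: the interpreter does not wait for this thread at exit,
        # so events that are still buffered at that point are lost.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def add(self, event: dict) -> None:
        # The queue is unbounded, so this does not block the caller.
        self._queue.put(event)

    def _run(self) -> None:
        batch: t.List[dict] = []
        deadline = time.monotonic() + self._flush_interval
        while True:
            timeout = max(0.0, deadline - time.monotonic())
            try:
                batch.append(self._queue.get(timeout=timeout))
            except queue.Empty:
                pass
            # Flush when the batch is full or the flush interval has elapsed.
            if len(batch) >= self._batch_size or time.monotonic() >= deadline:
                if batch:
                    try:
                        self._send(batch)
                    except Exception:
                        pass  # fail silent: a failed send drops the batch
                    batch = []
                deadline = time.monotonic() + self._flush_interval
```

In this sketch the trade-off named above is visible directly: a failed send discards its batch, and anything still in the queue when the process exits is never sent.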
Reasoning
User-facing evaluation workflows must never be delayed or broken by telemetry. The 1-second timeout bounds how long a send can wait on an unreachable analytics endpoint, and because sending happens on a background thread, that wait stays off the evaluation path. Because the sender is a daemon thread, the process can exit without waiting for analytics to flush. The `@silent` decorator cleanly separates "must work" (evaluation) from "nice to have" (analytics) code paths. The legacy analytics endpoint (`explodinggradients.com`) is intentionally preserved for backward compatibility despite a company rename.
Code Evidence
Silent error decorator from `src/ragas/_analytics.py:58-76`:
```python
def silent(func: t.Callable[P, T]) -> t.Callable[P, T]:
    # Silent errors when tracking
    @wraps(func)
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
        try:
            return func(*args, **kwargs)
        except Exception as err:
            if _usage_event_debugging():
                if get_debug_mode():
                    logger.error("Tracking Error: %s", err, stack_info=True)
                    raise err
                else:
                    logger.info("Tracking Error: %s", err)
            else:
                logger.debug("Tracking Error: %s", err)
            return None
    return wrapper
```
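To show the effect at a call site, here is a simplified, self-contained variant of the decorator. It is a sketch only: the debug branches are omitted, and `track_event` and `evaluate` are hypothetical functions invented for illustration.

```python
import logging
import typing as t
from functools import wraps

logger = logging.getLogger("telemetry_sketch")


def silent(func: t.Callable[..., t.Any]) -> t.Callable[..., t.Any]:
    """Simplified fail-silent decorator (debug branches from the excerpt omitted)."""

    @wraps(func)
    def wrapper(*args: t.Any, **kwargs: t.Any) -> t.Any:
        try:
            return func(*args, **kwargs)
        except Exception as err:
            logger.debug("Tracking Error: %s", err)
            return None

    return wrapper


@silent
def track_event(name: str) -> str:
    # Hypothetical tracking call that always fails, as it might without network access.
    raise ConnectionError(f"cannot reach analytics endpoint for {name!r}")


def evaluate() -> float:
    track_event("evaluation")  # the failure is swallowed; evaluation continues
    return 0.87  # placeholder score, for illustration only
```

Because the handler catches `Exception` rather than `BaseException`, `KeyboardInterrupt` and `SystemExit` still propagate, so a user can interrupt the program even while a tracking call is running.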
Timeout and legacy endpoint from `src/ragas/_analytics.py:36-38`:
```python
# NOTE: This URL intentionally remains as explodinggradients.com (legacy analytics endpoint)
USAGE_TRACKING_URL = "https://t.explodinggradients.com"
USAGE_REQUESTS_TIMEOUT_SEC = 1
```
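The excerpt does not show how the timeout is passed to the HTTP client, so the following is only a sketch of how a constant like this is typically applied, using the standard library's `urllib.request`. The function `post_events` and its JSON payload shape are assumptions, not part of Ragas.

```python
import json
import urllib.request

USAGE_REQUESTS_TIMEOUT_SEC = 1  # value from the excerpt above


def post_events(
    url: str, events: list, timeout: float = USAGE_REQUESTS_TIMEOUT_SEC
) -> bool:
    """Hypothetical sender: True on a 2xx response, False on any failure."""
    try:
        request = urllib.request.Request(
            url,
            data=json.dumps(events).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except Exception:
        # Fail silent: bad URLs, refused connections, timeouts, and HTTP
        # errors all count as "not sent".
        return False
```

With `urllib.request`, the timeout applies to each blocking socket operation (connecting, reading) rather than to the request as a whole, so the total wait can slightly exceed the configured value.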
Opt-out mechanism from `src/ragas/_analytics.py:46-49`:
```python
@lru_cache(maxsize=1)
def do_not_track() -> bool:
    return os.environ.get(RAGAS_DO_NOT_TRACK, str(False)).lower() == "true"
```
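One consequence of the `lru_cache` decorator is worth illustrating: the environment is read only on the first call, so the variable has to be set before that call. The sketch below reproduces the function in isolation. It assumes the `RAGAS_DO_NOT_TRACK` constant holds the variable's own name, which the excerpt does not show.

```python
import os
from functools import lru_cache

# Assumption: the constant holds the environment variable's own name.
RAGAS_DO_NOT_TRACK = "RAGAS_DO_NOT_TRACK"


@lru_cache(maxsize=1)
def do_not_track() -> bool:
    return os.environ.get(RAGAS_DO_NOT_TRACK, str(False)).lower() == "true"


os.environ.pop(RAGAS_DO_NOT_TRACK, None)
first = do_not_track()        # variable unset: False, and the result is now cached

os.environ[RAGAS_DO_NOT_TRACK] = "true"
second = do_not_track()       # still False: the cached result is returned

do_not_track.cache_clear()
third = do_not_track()        # True: the environment is read again
```

The comparison is against the string `"true"` after lowercasing, so values such as `1` or `yes` would not opt out.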