Principle:Helicone Helicone Analytics Storage

Knowledge Sources	Helicone
Domains	LLM Observability, Analytics Storage, OLAP Database
Last Updated	2026-02-14 00:00 GMT

Overview

Analytics storage is the practice of persisting processed LLM request-response records into a columnar OLAP database optimized for high-throughput ingestion and fast analytical queries over large time-series datasets.

Description

After a log message has passed through the handler chain (authentication, then enrichment with token counts, cost, prompt metadata, and custom properties), it must be durably stored in a system that supports the query patterns the observability dashboard needs: filtering by time range, provider, model, user, and custom properties; aggregating cost and token usage; and retrieving individual request-response pairs for inspection.

Traditional row-oriented relational databases struggle with the write throughput and analytical query performance required at scale. Instead, the system uses a column-oriented analytical database that stores records in a table optimized for time-series append patterns. The table engine deduplicates on a version column, so a logical update (e.g., adding a new property or score to an existing record) never mutates stored data in place: the writer inserts a new row with a higher version, and that row supersedes the previous one either when a background merge runs or at query time, if the query deduplicates explicitly.
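The version-based replacement described above can be illustrated with a minimal in-memory sketch. It is not Helicone's implementation: the Row type, its request_id key, and the helper names are hypothetical, and the resolution step only imitates what a deduplicating read would return.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Row:
    request_id: str          # hypothetical deduplication key
    version: int             # a higher version supersedes a lower one
    properties: dict = field(default_factory=dict)


def with_update(row: Row, **new_properties) -> Row:
    """Model an 'update' as a brand-new row with a bumped version."""
    merged = {**row.properties, **new_properties}
    return replace(row, version=row.version + 1, properties=merged)


def resolve_latest(rows: list[Row]) -> dict[str, Row]:
    """Keep only the highest-version row per key, as a deduplicating read would."""
    latest: dict[str, Row] = {}
    for row in rows:
        current = latest.get(row.request_id)
        if current is None or row.version > current.version:
            latest[row.request_id] = row
    return latest


original = Row("req-1", version=1, properties={"env": "prod"})
table = [original, with_update(original, score="0.9")]  # both rows are stored
visible = resolve_latest(table)                          # only version 2 is visible
```

The old row is never modified or deleted by the writer; it simply stops being visible once a newer version exists for the same key.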

Each record contains the full denormalized context: request and response IDs, timestamps, latency, status, model, provider, token counts (prompt, completion, cache, audio, reasoning), cost, user ID, organization ID, custom properties, scores, body content (or references to body content in object storage), and flags for caching, passthrough billing, and threat detection.
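As a rough illustration of what "fully denormalized" means here, the record can be pictured as one flat structure. Every field name and type below is an assumption made for the sketch, not the actual table schema.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RequestResponseRecord:
    # Identity and timing (all names are hypothetical)
    request_id: str
    response_id: str
    organization_id: str
    user_id: str
    request_created_at: datetime
    latency_ms: int
    status: int
    # Dimensions used for filtering
    model: str
    provider: str
    # Metrics used for aggregation
    prompt_tokens: int = 0
    completion_tokens: int = 0
    cache_tokens: int = 0
    audio_tokens: int = 0
    reasoning_tokens: int = 0
    cost: float = 0.0
    # Free-form metadata
    properties: dict[str, str] = field(default_factory=dict)
    scores: dict[str, int] = field(default_factory=dict)
    # Body stored inline, or a reference into object storage
    body: Optional[str] = None
    body_storage_ref: Optional[str] = None
    # Flags
    cache_enabled: bool = False
    is_passthrough_billing: bool = False
    threat_detected: bool = False
    # Version used for deduplication
    version: int = 1


record = RequestResponseRecord(
    request_id="req-1",
    response_id="res-1",
    organization_id="org-1",
    user_id="user-1",
    request_created_at=datetime(2026, 2, 14, tzinfo=timezone.utc),
    latency_ms=420,
    status=200,
    model="example-model",
    provider="example-provider",
    prompt_tokens=100,
    completion_tokens=50,
    properties={"env": "prod"},
)
flat = asdict(record)  # one flat mapping, ready to serialize as a single row
```

Because every dimension and metric travels with the row, dashboard queries can filter and aggregate without joining against other tables.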

Usage

Use this storage pattern when the observability platform must ingest tens of thousands of records per second while simultaneously serving sub-second analytical queries across billions of rows. This is appropriate when the data model is append-heavy, updates are infrequent and can be modeled as version-based replacements, and the primary query patterns involve filtering and aggregation over time-partitioned data.

Theoretical Basis

The pattern relies on ClickHouse's ReplacingMergeTree table engine, or an equivalent engine in another column-oriented database. The key properties are:

Columnar storage: Data is stored by column rather than by row, enabling efficient compression and fast scans over specific dimensions (e.g., summing cost across all requests for a model).
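Append-only with deduplication: Inserts are always appends. When multiple rows share the same sorting key, a background merge retains only the row with the highest version (or, if no version column is defined, the most recently inserted row). Merges run asynchronously, so duplicates can remain visible until they complete.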
Asynchronous inserts: Records are buffered and flushed in batches, maximizing write throughput at the cost of slightly delayed visibility.
Partitioning: Data is partitioned by time (typically by day or month), allowing the database to prune irrelevant partitions during time-range queries.
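Two of the properties above, buffered inserts and partition pruning, can be sketched with a toy table. This is a simplified model under stated assumptions (monthly partitions, a row-count flush threshold, hypothetical class and field names); a real engine also flushes on timeouts and byte limits.

```python
from collections import defaultdict
from datetime import datetime


def partition_key(ts: datetime) -> str:
    """Monthly partition label such as '2026-02' (assumed granularity)."""
    return ts.strftime("%Y-%m")


class BufferedPartitionedTable:
    def __init__(self, max_buffered_rows: int):
        self.max_buffered_rows = max_buffered_rows
        self.buffer: list[dict] = []
        self.partitions: dict[str, list[dict]] = defaultdict(list)

    def insert(self, row: dict) -> None:
        """Buffer the row; it is not queryable until the next flush."""
        self.buffer.append(row)
        if len(self.buffer) >= self.max_buffered_rows:
            self.flush()

    def flush(self) -> None:
        """Write the whole buffer as one batch, routing rows to partitions."""
        for row in self.buffer:
            self.partitions[partition_key(row["ts"])].append(row)
        self.buffer.clear()

    def total_cost(self, start: datetime, end: datetime) -> tuple[float, int]:
        """Sum cost in [start, end) and report how many partitions were read."""
        lo, hi = partition_key(start), partition_key(end)
        # 'YYYY-MM' labels sort lexicographically, so a string range prunes partitions.
        scanned = [key for key in self.partitions if lo <= key <= hi]
        cost = sum(
            row["cost"]
            for key in scanned
            for row in self.partitions[key]
            if start <= row["ts"] < end
        )
        return cost, len(scanned)


table = BufferedPartitionedTable(max_buffered_rows=2)
table.insert({"ts": datetime(2026, 1, 15), "cost": 0.25})
visible_before_flush = sum(len(rows) for rows in table.partitions.values())
table.insert({"ts": datetime(2026, 2, 3), "cost": 0.5})    # second row triggers a flush
table.insert({"ts": datetime(2026, 2, 10), "cost": 0.25})  # stays in the buffer
cost, partitions_read = table.total_cost(datetime(2026, 2, 1), datetime(2026, 2, 28))
```

The February query reads one partition and skips January entirely, and the row still sitting in the buffer is not counted, which is the delayed-visibility trade-off of batching.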

The theoretical insertion flow is:

Construct a fully denormalized record containing all dimensions and metrics.
Issue an asynchronous bulk insert into the analytics table.
The database engine appends the records to the current partition.
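Background merge threads eventually consolidate rows with duplicate keys within each partition, retaining only the latest version.
Queries that need exactly one row per key deduplicate at read time (for example with the FINAL modifier), because rows that have not yet been merged may otherwise appear more than once.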
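The flow above can be modeled with a toy table in which every insert creates a new part and a merge collapses the parts. It is a deliberately simplified sketch: the class, the row shape, and the single-partition merge are assumptions, and in ClickHouse merges run at an unspecified time and only within a partition. The point it illustrates is that a read issued before the merge can still return superseded rows unless it deduplicates, as the FINAL modifier does.

```python
from itertools import chain

# (request_id, version, cost); a hypothetical row shape for illustration
StoredRow = tuple[str, int, float]


def dedupe(rows) -> list[StoredRow]:
    """Keep the highest-version row per request_id."""
    latest: dict[str, StoredRow] = {}
    for row in rows:
        key, version, _cost = row
        if key not in latest or version > latest[key][1]:
            latest[key] = row
    return sorted(latest.values())


class ToyReplacingTable:
    def __init__(self):
        self.parts: list[list[StoredRow]] = []  # each insert creates a new part

    def insert(self, batch: list[StoredRow]) -> None:
        """Append-only write: existing parts are never modified."""
        self.parts.append(sorted(batch))

    def merge(self) -> None:
        """Background merge: combine all parts and drop superseded rows."""
        self.parts = [dedupe(chain.from_iterable(self.parts))]

    def select(self, final: bool = False) -> list[StoredRow]:
        """Read all parts; with final=True, deduplicate at read time."""
        rows = list(chain.from_iterable(self.parts))
        return dedupe(rows) if final else sorted(rows)


table = ToyReplacingTable()
table.insert([("req-1", 1, 0.02), ("req-2", 1, 0.05)])
table.insert([("req-1", 2, 0.03)])       # newer version of req-1

before_merge = table.select()            # old and new versions both returned
with_final = table.select(final=True)    # deduplicated at read time
table.merge()
after_merge = table.select()             # the merge removed the old version
```

Reading with final=True costs extra work on every query, while waiting for the merge gives cheaper reads but no guarantee about when duplicates disappear.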

Related Pages

Implemented By

Implementation:Helicone_Helicone_VersionedRequestStore_InsertRequestResponseVersioned

Uses Heuristic

Heuristic:Helicone_Helicone_ClickHouse_ReplacingMergeTree_FINAL
