Principle:Truera Trulens Dashboard Tab Architecture
| Knowledge Sources | |
|---|---|
| Domains | Dashboard UI, Streamlit Architecture, ML Observability |
| Last Updated | 2026-02-14 08:00 GMT |
Overview
The TruLens dashboard uses a multi-tab Streamlit navigation architecture. Each tab (Leaderboard, Records, Compare) is an independent Python module responsible for its own page state initialization, data fetching, grid rendering, and user interaction handling. The tabs are coordinated through a shared utilities layer.
Description
The Dashboard Tab Architecture is the structural pattern governing how the TruLens interactive evaluation dashboard is organized and rendered. The dashboard is a Streamlit multi-page application that provides three primary views for inspecting LLM application performance:
- Leaderboard -- An aggregated view of app versions with feedback scores, cost metrics, latency, and token counts. It supports grid, histogram, and list sub-tabs for different visual representations of the same data.
- Records -- A detailed per-record view showing individual invocation inputs, outputs, feedback results, and trace details. Users can select a single record and drill into its feedback pills, trace timeline, and metadata.
- Compare -- A side-by-side comparison view for two or more app versions, displaying overlapping records, per-metric differences or standard deviations, and comparative trace exploration.
Each tab module follows a consistent internal structure: a top-level main function that sets page config and renders the sidebar, a page state initializer that reads query parameters into Streamlit session state exactly once, a data preprocessing step that transforms raw database records into display-ready DataFrames, and one or more rendering functions that compose Streamlit or AgGrid widgets.
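The four-part structure described above can be sketched without Streamlit as follows. This is a minimal illustration, not the TruLens source: the function names, the `PAGE_NAME` constant, and the record fields (`ts`, `app_version`, `output`) are hypothetical, and plain dicts plus a `write` callback stand in for `st.session_state`, `st.query_params`, and Streamlit widgets.

```python
PAGE_NAME = "Leaderboard"  # hypothetical page name used as a key prefix


def init_page_state(session_state, query_params):
    """Copy query parameters into session state exactly once."""
    flag = f"{PAGE_NAME}.initialized"
    if session_state.get(flag):
        return
    for key, value in query_params.items():
        session_state[f"{PAGE_NAME}.{key}"] = value
    session_state[flag] = True


def preprocess(raw_records):
    """Turn raw records into display-ready rows (newest first)."""
    return sorted(raw_records, key=lambda rec: rec["ts"], reverse=True)


def render(rows, write):
    """Emit one line per row; a real tab would compose grid widgets here."""
    for row in rows:
        write(f"{row['app_version']}: {row['output']}")


def main(session_state, query_params, fetch_records, write=print):
    # A real tab would also set the page config and render the shared
    # sidebar here before doing anything else.
    init_page_state(session_state, query_params)
    rows = preprocess(fetch_records())
    render(rows, write)
```

Passing the state containers in as arguments is a choice made for this sketch so that the flow can be exercised outside a running Streamlit app.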
The architecture solves the problem of managing complex, stateful dashboard interactions across multiple views of the same underlying data (records, feedback results, app metadata) without duplicating data-fetching or configuration logic. Shared concerns -- session management, cached data access, sidebar rendering, version filtering, and query parameter synchronization -- are factored into a centralized dashboard_utils module. UI primitives such as feedback category coloring, AgGrid CSS rules, and metric delta styling live in a styles module, while reusable Streamlit widgets (metadata display, selector buttons) reside in a components module.
In the ML observability landscape, this architecture provides the interactive exploration layer that sits on top of TruLens' evaluation engine and database. It bridges the gap between raw evaluation data and human understanding by offering filterable, sortable, and visually-annotated views of LLM application behavior.
Usage
This architecture is the right pattern when:
- A Streamlit application must present multiple related but distinct views of the same underlying data (e.g., aggregated vs. per-record vs. comparative).
- Each view requires independent page state (selected rows, filter settings, pagination) that persists across Streamlit reruns but does not leak between tabs.
- Deep linking via query parameters is needed so users can bookmark or share specific dashboard states (selected app versions, filtered records, pinned comparisons).
- The application must support pluggable rendering backends -- for example, falling back from AgGrid to native Streamlit dataframes when a third-party package is unavailable or when running in constrained environments (Streamlit in Snowflake).
- Custom pages may be added at runtime through an environment variable, requiring a navigation structure that can dynamically discover and register additional page modules.
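The pluggable-backend condition in the list above can be illustrated with a small selection helper. This is a sketch under assumptions: `st_aggrid` is the import name of the third-party AgGrid wrapper, while `pick_grid_backend`, `render_grid`, the `force_native` flag, and the `renderers` mapping are hypothetical names introduced here, not TruLens APIs.

```python
import importlib.util


def pick_grid_backend(preferred="st_aggrid", force_native=False):
    """Return "aggrid" if the optional package is importable, else "native".

    force_native models a constrained environment in which the
    third-party grid should not be used even if it is installed.
    """
    if force_native:
        return "native"
    if importlib.util.find_spec(preferred) is None:
        return "native"
    return "aggrid"


def render_grid(rows, renderers, backend=None):
    """Dispatch to the renderer registered for the chosen backend."""
    backend = backend or pick_grid_backend()
    return renderers[backend](rows)
```

Checking for the package with `find_spec` avoids importing it just to discover that it is missing; a `try`/`except ImportError` around the import would be an equally reasonable choice.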
Theoretical Basis
The core algorithm for the tab architecture can be expressed as the following abstract lifecycle:
1. Navigation Registration
The entry point registers all tab modules as pages in Streamlit's navigation system. Each tab is a standalone Python file discovered from a known directory. An optional environment variable allows injection of additional custom page files.
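The discovery step might look like the sketch below. The source does not name the environment variable, so `EXAMPLE_EXTRA_PAGES` is a placeholder, and `discover_pages` is a hypothetical helper. In a Streamlit app, the returned paths could then be wrapped in `st.Page` objects and passed to `st.navigation`; that part is omitted so the example runs without Streamlit.

```python
import os
from pathlib import Path


def discover_pages(tabs_dir, env_var="EXAMPLE_EXTRA_PAGES", environ=None):
    """Collect built-in tab files plus optional custom page files.

    env_var is a placeholder name; its value is read as a list of file
    paths separated by os.pathsep.
    """
    environ = os.environ if environ is None else environ
    # Built-in tabs: every public .py file in the known directory.
    pages = sorted(
        path for path in Path(tabs_dir).glob("*.py")
        if not path.name.startswith("_")
    )
    # Custom pages: only entries that point at an existing .py file.
    for entry in environ.get(env_var, "").split(os.pathsep):
        path = Path(entry) if entry else None
        if path is not None and path.suffix == ".py" and path.is_file():
            pages.append(path)
    return pages
```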
2. Per-Tab Initialization (exactly once per session)
Each tab checks a session state flag (page_name.initialized). If not yet initialized, query parameters are deserialized into session state using optional type-transform functions (e.g., comma-separated strings to lists, string booleans to actual booleans). The flag is then set to prevent re-initialization on subsequent reruns.
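A minimal sketch of this once-only initialization, with optional type transforms, is shown below. The helper names are hypothetical, and ordinary dicts stand in for `st.session_state` and `st.query_params`.

```python
def csv_to_list(raw):
    """Transform "a,b,c" into ["a", "b", "c"], dropping empty items."""
    return [item for item in raw.split(",") if item]


def str_to_bool(raw):
    """Transform a string boolean such as "true" into a real bool."""
    return raw.strip().lower() in {"1", "true", "yes"}


def read_query_params_into_state(page_name, session_state, query_params,
                                 transforms=None):
    """Deserialize query parameters into session state on the first run.

    Returns True if initialization ran and False if the flag was already
    set, in which case session state is left untouched.
    """
    flag = f"{page_name}.initialized"
    if session_state.get(flag, False):
        return False
    transforms = transforms or {}
    for param, raw in query_params.items():
        convert = transforms.get(param)
        session_state[f"{page_name}.{param}"] = convert(raw) if convert else raw
    session_state[flag] = True
    return True
```

On later reruns the function returns early, so values the user has changed in session state are not overwritten by stale query parameters.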
3. Shared Sidebar Rendering
A common sidebar function retrieves the list of available apps, renders a selector if multiple apps exist, and provides a refresh button that clears all caches and session state. The selected app name is stored both in session state and query parameters for deep linking.
4. Data Fetching with Caching
Cached functions (with a configurable TTL) retrieve records, feedback definitions, and app version metadata from the TruLens database. Caching is done with Streamlit's st.cache_data, so every tab that calls a cached function with the same arguments receives the same cached result instead of issuing a redundant database query.
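The behavior described here can be approximated with a small argument-keyed TTL cache. This is a dependency-free stand-in written for illustration, not Streamlit's implementation; in a real tab the decorator would be `st.cache_data(ttl=...)`. The `ttl_cache` name and the injectable `clock` parameter are assumptions of this sketch.

```python
import functools
import time


def ttl_cache(ttl_seconds, clock=time.monotonic):
    """Cache results per positional-argument tuple for ttl_seconds."""
    def decorator(fn):
        store = {}  # args -> (time stored, value)

        @functools.wraps(fn)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]  # fresh entry: skip the expensive call
            value = fn(*args)
            store[args] = (now, value)
            return value

        # Lets a refresh button drop every cached entry at once.
        wrapper.clear = store.clear
        return wrapper
    return decorator
```

Because entries are keyed by the call arguments, two tabs that request records for the same app share one result, while a request for a different app triggers its own query.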
5. Data Preprocessing
Each tab applies its own preprocessing pipeline:
- Leaderboard groups records by app version, computes aggregate metrics (mean latency, total cost, mean feedback scores), and joins version metadata.
- Records sorts by timestamp, cleans text encodings (mojibake repair, unicode escape decoding), and optionally filters by a search query.
- Compare merges records across app versions on shared inputs, computes per-metric diffs or standard deviations, and constructs a comparison DataFrame.
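The three pipelines above can be sketched in plain Python. The dashboard works on DataFrames, so the lists of dicts here are a dependency-free stand-in. The field names (`app_version`, `input`, `latency`, `cost`, `score`) and all function names are hypothetical, and only the mojibake part of the Records cleaning is shown, using one common repair (text that was UTF-8 but was decoded as Latin-1).

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate_leaderboard(records):
    """Leaderboard: group by app version and compute aggregate metrics."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["app_version"]].append(rec)
    return [
        {
            "app_version": version,
            "records": len(recs),
            "mean_latency": mean(r["latency"] for r in recs),
            "total_cost": sum(r["cost"] for r in recs),
            "mean_score": mean(r["score"] for r in recs),
        }
        for version, recs in sorted(groups.items())
    ]


def repair_mojibake(text):
    """Records: undo UTF-8 text that was mis-decoded as Latin-1."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text  # not this kind of mojibake; leave unchanged


def compare_versions(records, versions):
    """Compare: keep inputs shared by all versions; diff for two, std for more."""
    by_input = defaultdict(dict)
    for rec in records:
        if rec["app_version"] in versions:
            by_input[rec["input"]][rec["app_version"]] = rec["score"]
    rows = []
    for text, scores in by_input.items():
        if len(scores) < len(versions):
            continue  # this input was not run by every selected version
        ordered = [scores[v] for v in versions]
        if len(versions) == 2:
            rows.append({"input": text, "diff": ordered[1] - ordered[0]})
        else:
            rows.append({"input": text, "std": pstdev(ordered)})
    return rows
```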
6. Rendering with State Feedback
Grid components (AgGrid or native Streamlit dataframe) render the preprocessed data and return user selections. Selection events update session state and query parameters, triggering downstream rendering of detail panels, feedback pills, or trace viewers. Tabs communicate with each other by writing to shared session state keys (e.g., setting Records.app_ids from a Leaderboard button click, then calling st.switch_page).
7. Cross-Tab Navigation
Buttons in one tab can redirect to another by writing the necessary context (selected app IDs, record IDs) into session state and invoking st.switch_page. The target tab reads these values during its initialization phase, so it opens with the intended selection already applied.
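The handoff can be sketched as a pair of functions, one for the source tab and one for the target. `Records.app_ids` is the key named earlier in this section. The function names, the `app_ids` query parameter, and the `pages/Records.py` path are hypothetical, and `switch_page` is injected so the sketch runs without Streamlit (in an app it would be `st.switch_page`).

```python
def open_in_records(session_state, app_ids, switch_page,
                    target="pages/Records.py"):
    """Source tab: store the context, then ask for a page switch."""
    session_state["Records.app_ids"] = list(app_ids)
    switch_page(target)


def records_selected_app_ids(session_state, query_params):
    """Target tab: prefer the handed-off context, else the query parameter."""
    if "Records.app_ids" in session_state:
        return session_state["Records.app_ids"]
    raw = query_params.get("app_ids", "")
    return [item for item in raw.split(",") if item]
```

Writing the context before switching matters because the target page only sees what is already in session state when it starts to run.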