Implementation:Truera Trulens Leaderboard Tab

From Leeroopedia
Knowledge Sources
Domains Dashboard, Visualization
Last Updated 2026-02-14 08:00 GMT

Overview

Streamlit dashboard tab that displays a leaderboard of app versions with aggregated performance metrics, feedback scores, cost tracking, and interactive controls for pinning, comparing, metadata editing, and adding virtual apps.

Description

The Leaderboard Tab module (Leaderboard.py) is the primary landing view of the TruLens dashboard. It aggregates record-level data across app versions and presents three sub-tabs:

  • App Versions (Grid Tab) -- An interactive AgGrid (or native Streamlit dataframe fallback) showing each app version as a row with columns for record count, average latency, total cost (USD and/or Snowflake Credits), mean feedback scores, and user-defined metadata. The grid supports multi-row selection and action buttons:
    • Pin / Unpin App -- toggles a pinned flag on selected app versions for quick filtering.
    • Examine Records -- navigates to the Records tab with the selected app IDs pre-filtered.
    • Compare -- navigates to the Compare tab with 2--5 selected app versions.
    • Add/Edit Metadata -- opens a dialog to set arbitrary key-value metadata on selected app versions.
    • Add Virtual App -- opens a dialog to create a synthetic baseline app version with user-specified feedback scores (only available in non-OTEL mode).
  • Metric Histograms (Plot Tab) -- A grid of Plotly histograms (one per feedback metric) showing score distributions across all records, with automatic layout into 4 columns.
  • List View (List Tab) -- A card-style view where each app version is rendered with Streamlit metrics showing record count, latency, total tokens, cost, and color-coded feedback scores with category icons.
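The 4-column layout of the Metric Histograms tab can be pictured with a small helper. This is a minimal sketch, not the module's own code: `grid_positions` is a hypothetical name, the left-to-right, top-to-bottom fill order is an assumption, and the metric names `coherence` and `toxicity` are made up for illustration.

```python
from typing import Dict, List, Tuple


def grid_positions(names: List[str], n_cols: int = 4) -> Dict[str, Tuple[int, int]]:
    """Map each metric name to a 1-based (row, col) cell, filling rows left to right."""
    return {
        name: (index // n_cols + 1, index % n_cols + 1)
        for index, name in enumerate(names)
    }


# Hypothetical list of feedback metrics; one histogram would be drawn per entry.
metrics = ["relevance", "groundedness", "answer_relevance", "coherence", "toxicity"]
positions = grid_positions(metrics)

# Ceiling division gives the number of rows needed for the grid.
n_rows = -(-len(metrics) // 4)
```

With five metrics the fifth histogram wraps onto a second row, so the grid needs two rows.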

The module preprocesses data by grouping records per app version, computing aggregations (record count, mean latency, mean feedback scores, summed cost), joining the result with app-version metadata, and showing a cost column only when it contains at least one nonzero value.
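The grouping and aggregation step can be approximated with plain pandas. The sketch below is an illustration under stated assumptions rather than the body of `_preprocess_df`: the toy data, the use of `app_id` as the grouping key, and the inner join are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical record-level data with a single feedback column.
records = pd.DataFrame({
    "app_id": ["a1", "a1", "a2"],
    "record_id": ["r1", "r2", "r3"],
    "latency": [1.0, 2.0, 4.0],
    "total_cost": [0.01, 0.02, 0.0],
    "relevance": [0.8, 0.6, 0.9],
})

# Hypothetical app-version table with one metadata column (model_name).
versions = pd.DataFrame({
    "app_id": ["a1", "a2"],
    "app_name": ["demo", "demo"],
    "app_version": ["v1", "v2"],
    "model_name": ["small", "large"],
})

# One row per app version: count records, average latency and feedback, sum cost.
agg = (
    records.groupby("app_id")
    .agg(**{
        "Records": ("record_id", "count"),
        "Average Latency (s)": ("latency", "mean"),
        "Total Cost (USD)": ("total_cost", "sum"),
        "relevance": ("relevance", "mean"),
    })
    .reset_index()
)

# Attach version metadata and round numeric columns to 3 decimal places.
leaderboard = versions.merge(agg, on="app_id", how="inner").round(3)
```

Named aggregation keeps the output column labels aligned with the `APP_AGG_COLS` names shown in the signature listing.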

Usage

Use the Leaderboard Tab as the entry point to the TruLens dashboard for an overview of how all app versions are performing. It is the recommended starting point for identifying regressions, selecting versions to drill into via the Records tab, or setting up comparisons via the Compare tab. The pinning feature allows users to bookmark important baseline versions.

Code Reference

Source Location

Signature

APP_COLS = ["app_version", "app_id", "app_name"]
APP_AGG_COLS = [
    "Records",
    "Average Latency (s)",
    "Total Cost (USD)",
    "Total Cost (Snowflake Credits)",
]

def _get_nonzero_cost_columns(df: pd.DataFrame) -> List[str]: ...

def init_page_state() -> None: ...

def _preprocess_df(
    records_df: pd.DataFrame,
    app_versions_df: pd.DataFrame,
    feedback_col_names: List[str],
    metadata_col_names: List[str],
    show_all: bool = False,
) -> pd.DataFrame: ...

def order_columns(
    df: pd.DataFrame,
    order: Sequence[str],
) -> pd.DataFrame: ...

def _build_grid_options(
    df: pd.DataFrame,
    feedback_col_names: List[str],
    feedback_directions: Dict[str, bool],
    version_metadata_col_names: Sequence[str],
) -> dict: ...

def _render_grid(
    df: pd.DataFrame,
    feedback_col_names: List[str],
    feedback_directions: Dict[str, bool],
    version_metadata_col_names: List[str],
    grid_key: Optional[str] = None,
) -> pd.DataFrame: ...

def handle_pin_toggle(
    selected_app_ids: List[str],
    on_leaderboard: bool,
) -> None: ...

def handle_table_edit(
    df: pd.DataFrame,
    event_data: Dict[str, Any],
    version_metadata_col_names: List[str],
) -> None: ...

@streamlit_compat.st_dialog("Add/Edit Metadata")
def handle_add_metadata(
    selected_rows: pd.DataFrame,
    metadata_col_names: List[str],
) -> None: ...

@streamlit_compat.st_dialog("Add Virtual App")
def handle_add_virtual_app(
    app_name: str,
    feedback_col_names: List[str],
    feedback_defs: Any,
    metadata_col_names: List[str],
) -> None: ...

def _render_grid_tab(
    df: pd.DataFrame,
    feedback_col_names: List[str],
    feedback_defs: Any,
    feedback_directions: Dict[str, bool],
    version_metadata_col_names: List[str],
    app_name: str,
    grid_key: Optional[str] = None,
) -> None: ...

@streamlit_compat.st_fragment
def _render_list_tab(
    df: pd.DataFrame,
    feedback_col_names: List[str],
    feedback_directions: Dict[str, bool],
    version_metadata_col_names: List[str],
    max_feedback_cols: int = 6,
) -> None: ...

@streamlit_compat.st_fragment
def _render_plot_tab(
    df: pd.DataFrame,
    feedback_col_names: List[str],
) -> None: ...

def render_leaderboard(app_name: str) -> None: ...

def leaderboard_main() -> None: ...

Import

from trulens.dashboard.tabs.Leaderboard import render_leaderboard
from trulens.dashboard.tabs.Leaderboard import leaderboard_main
from trulens.dashboard.tabs.Leaderboard import init_page_state

I/O Contract

Inputs

init_page_state

Name Type Required Description
(none) -- -- Reads LEADERBOARD_PAGE_NAME.only_show_pinned and LEADERBOARD_PAGE_NAME.metadata_cols from query parameters and session state.

_get_nonzero_cost_columns

Name Type Required Description
df pd.DataFrame yes DataFrame containing "Total Cost (USD)" and/or "Total Cost (Snowflake Credits)" columns.

_preprocess_df

Name Type Required Description
records_df pd.DataFrame yes Raw records DataFrame with columns including record_id, latency, total_cost, cost_currency, total_tokens, tags, and feedback columns.
app_versions_df pd.DataFrame yes App versions DataFrame with app_id, app_name, app_version, and metadata columns.
feedback_col_names List[str] yes List of feedback column names to aggregate (mean).
metadata_col_names List[str] yes List of metadata column names to join from the app versions table.
show_all bool no If True, uses a right join to include versions with no records. Defaults to False.
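The effect of `show_all` can be illustrated with a pandas merge. This is a hedged sketch: `join_versions` is a hypothetical helper, and the join used when `show_all` is False is assumed to be an inner join, which the description above does not state.

```python
import pandas as pd

# Aggregated records exist only for version a1 in this toy example.
agg = pd.DataFrame({"app_id": ["a1"], "Records": [3]})
versions = pd.DataFrame({"app_id": ["a1", "a2"], "app_version": ["v1", "v2"]})


def join_versions(agg: pd.DataFrame, versions: pd.DataFrame, show_all: bool = False) -> pd.DataFrame:
    """Join aggregates with versions; a right join keeps versions without records."""
    how = "right" if show_all else "inner"  # "inner" is an assumption for this sketch
    return agg.merge(versions, on="app_id", how=how)


only_with_records = join_versions(agg, versions)
all_versions = join_versions(agg, versions, show_all=True)
```

In the right-join result, version `a2` appears with a missing `Records` value because no records were aggregated for it.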

render_leaderboard

Name Type Required Description
app_name str yes The name of the application to render the leaderboard for.

handle_pin_toggle

Name Type Required Description
selected_app_ids List[str] yes List of app IDs whose pinned status should be toggled.
on_leaderboard bool yes Current pinned state; True means the apps are currently pinned (will be unpinned).
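The toggle semantics of `on_leaderboard` can be sketched with an in-memory mapping. The dictionary below is only a stand-in for wherever the dashboard stores the pinned flag, and `toggle_pins` is a hypothetical name; the point is that every selected app receives the opposite of the current state.

```python
from typing import Dict, List


def toggle_pins(
    pinned: Dict[str, bool],
    selected_app_ids: List[str],
    on_leaderboard: bool,
) -> Dict[str, bool]:
    """Return a copy of the mapping with each selected app set to the opposite state."""
    updated = dict(pinned)
    for app_id in selected_app_ids:
        # on_leaderboard=True means "currently pinned", so the new state is unpinned.
        updated[app_id] = not on_leaderboard
    return updated


pinned = {"app_a": False, "app_b": True}
after_pin = toggle_pins(pinned, ["app_a"], on_leaderboard=False)
after_unpin = toggle_pins(after_pin, ["app_a", "app_b"], on_leaderboard=True)
```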

handle_table_edit

Name Type Required Description
df pd.DataFrame yes Current leaderboard DataFrame.
event_data Dict[str, Any] yes AgGrid cellValueChanged event data containing the edited row and changed values.
version_metadata_col_names List[str] yes Metadata column names that are editable.

handle_add_metadata

Name Type Required Description
selected_rows pd.DataFrame yes DataFrame of currently selected app version rows.
metadata_col_names List[str] yes Existing metadata column names (will be extended if a new key is added).

handle_add_virtual_app

Name Type Required Description
app_name str yes Name of the application to create the virtual version under.
feedback_col_names List[str] yes Available feedback metric names.
feedback_defs Any yes Feedback definitions DataFrame with feedback_name and feedback_definition_id columns.
metadata_col_names List[str] yes Existing metadata column names.

Outputs

_get_nonzero_cost_columns

Name Type Description
cols List[str] List of cost column names that have any nonzero values, ordered USD first then Snowflake Credits.
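One way to obtain this behavior is sketched below. It is an assumption-laden stand-in, not the source of `_get_nonzero_cost_columns`: the name `nonzero_cost_columns`, the handling of missing columns, and the treatment of missing values as zero are all choices made for the example.

```python
from typing import List

import pandas as pd

# Fixed order: USD first, then Snowflake Credits.
COST_COLS = ["Total Cost (USD)", "Total Cost (Snowflake Credits)"]


def nonzero_cost_columns(df: pd.DataFrame) -> List[str]:
    """Return the cost columns present in df that hold at least one nonzero value."""
    return [
        col
        for col in COST_COLS
        if col in df.columns and (df[col].fillna(0) != 0).any()
    ]


df = pd.DataFrame({
    "Total Cost (USD)": [0.0, 0.05, 0.12],
    "Total Cost (Snowflake Credits)": [0.0, 0.0, 0.0],
})
cost_cols = nonzero_cost_columns(df)
```

Iterating over a fixed list, rather than over the DataFrame's columns, is what keeps the USD column ahead of Snowflake Credits regardless of column order in the input.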

_preprocess_df

Name Type Description
df pd.DataFrame Aggregated DataFrame grouped by app version, with record counts, latency means, cost sums, feedback means, and joined metadata, rounded to 3 decimal places.

render_leaderboard

Name Type Description
(none -- renders to Streamlit) None Renders the complete leaderboard UI with three sub-tabs.

_render_grid

Name Type Description
selected_rows pd.DataFrame DataFrame of user-selected rows from the grid (may be empty).
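The selection returned by the grid drives the action buttons described earlier; Compare, for instance, accepts between 2 and 5 app versions. A minimal sketch of such a check follows, with `can_compare` and the two constants as hypothetical names rather than identifiers from the module.

```python
from typing import List

# Bounds taken from the Compare action described above (2 to 5 versions).
MIN_COMPARE = 2
MAX_COMPARE = 5


def can_compare(selected_app_ids: List[str]) -> bool:
    """Return True when the number of selected versions is within the Compare bounds."""
    return MIN_COMPARE <= len(selected_app_ids) <= MAX_COMPARE


too_few = can_compare(["a1"])
valid = can_compare(["a1", "a2", "a3"])
too_many = can_compare(["a1", "a2", "a3", "a4", "a5", "a6"])
```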

Usage Examples

# Example 1: Running the leaderboard as a standalone Streamlit page
# (typically invoked as: streamlit run tabs/Leaderboard.py)
from trulens.dashboard.tabs.Leaderboard import leaderboard_main

if __name__ == "__main__":
    leaderboard_main()

# Example 2: Rendering the leaderboard within a custom Streamlit app
from trulens.dashboard.tabs.Leaderboard import init_page_state, render_leaderboard

init_page_state()
render_leaderboard(app_name="my_rag_pipeline")

# Example 3: Using _get_nonzero_cost_columns to conditionally display cost data
import pandas as pd
from trulens.dashboard.tabs.Leaderboard import _get_nonzero_cost_columns

df = pd.DataFrame({
    "Total Cost (USD)": [0.0, 0.05, 0.12],
    "Total Cost (Snowflake Credits)": [0.0, 0.0, 0.0],
})
cost_cols = _get_nonzero_cost_columns(df)
# Returns: ["Total Cost (USD)"]  -- Snowflake Credits excluded because all zeros

# Example 4: Using _preprocess_df to prepare aggregated leaderboard data
# raw_records_df and versions_df are placeholders for DataFrames loaded elsewhere,
# with the columns listed under Inputs for _preprocess_df.
from trulens.dashboard.tabs.Leaderboard import _preprocess_df

aggregated_df = _preprocess_df(
    records_df=raw_records_df,
    app_versions_df=versions_df,
    feedback_col_names=["relevance", "groundedness", "answer_relevance"],
    metadata_col_names=["model_name", "prompt_template"],
    show_all=False,
)
