Overview
Streamlit dashboard tab that displays a leaderboard of app versions with aggregated performance metrics, feedback scores, cost tracking, and interactive controls for pinning, comparing, metadata editing, and adding virtual apps.
Description
The Leaderboard Tab module (Leaderboard.py) is the primary landing view of the TruLens dashboard. It aggregates record-level data across app versions and presents three sub-tabs:
- App Versions (Grid Tab) -- An interactive AgGrid (or native Streamlit dataframe fallback) showing each app version as a row with columns for record count, average latency, total cost (USD and/or Snowflake Credits), mean feedback scores, and user-defined metadata. The grid supports multi-row selection and action buttons:
  - Pin / Unpin App -- toggles a pinned flag on selected app versions for quick filtering.
  - Examine Records -- navigates to the Records tab with the selected app IDs pre-filtered.
  - Compare -- navigates to the Compare tab with 2--5 selected app versions.
  - Add/Edit Metadata -- opens a dialog to set arbitrary key-value metadata on selected app versions.
  - Add Virtual App -- opens a dialog to create a synthetic baseline app version with user-specified feedback scores (only available in non-OTEL mode).
- Metric Histograms (Plot Tab) -- A grid of Plotly histograms (one per feedback metric) showing score distributions across all records, with automatic layout into 4 columns.
- List View (List Tab) -- A card-style view where each app version is rendered with Streamlit metrics showing record count, latency, total tokens, cost, and color-coded feedback scores with category icons.
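The 4-column layout used by the Metric Histograms tab can be pictured with a small sketch. The helper `grid_positions` and the metric names below are hypothetical, chosen only to illustrate how one histogram per metric might be assigned to a (row, column) cell; the module's actual layout code may differ.

```python
import math
from typing import Dict, List, Tuple


def grid_positions(metrics: List[str], n_cols: int = 4) -> Dict[str, Tuple[int, int]]:
    """Assign each metric a 1-based (row, col) cell, filling rows left to right."""
    return {
        name: (i // n_cols + 1, i % n_cols + 1)
        for i, name in enumerate(metrics)
    }


# Hypothetical feedback metric names, for illustration only.
metrics = ["relevance", "groundedness", "answer_relevance", "coherence", "conciseness"]
positions = grid_positions(metrics)

# Number of rows needed to hold all histograms at 4 per row.
n_rows = math.ceil(len(metrics) / 4)
```

With five metrics, the first four fill row 1 and the fifth starts row 2.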
The module handles data preprocessing by grouping records per app version, computing aggregations (count, mean, sum), joining with app-version metadata, and dynamically showing or hiding cost columns based on whether any nonzero values exist.
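As a rough illustration of this preprocessing, the sketch below groups hypothetical record-level data by app version, computes the count, mean, and sum aggregations, and joins version metadata. It uses plain pandas on made-up data and is not the module's actual `_preprocess_df` code; the input column names, the join key, and the sample values are assumptions.

```python
import pandas as pd

# Hypothetical record-level data (column names are assumptions).
records = pd.DataFrame({
    "app_id": ["a1", "a1", "a2"],
    "record_id": ["r1", "r2", "r3"],
    "latency": [1.0, 3.0, 2.0],
    "total_cost": [0.01, 0.03, 0.0],
    "relevance": [0.8, 0.6, 0.9],
})

# Hypothetical app-version metadata; "a3" has no records.
versions = pd.DataFrame({
    "app_id": ["a1", "a2", "a3"],
    "app_version": ["v1", "v2", "v3"],
    "model_name": ["small", "large", "large"],
})

# One row per app version: record count, mean latency, summed cost, mean feedback.
agg = (
    records.groupby("app_id")
    .agg(**{
        "Records": ("record_id", "count"),
        "Average Latency (s)": ("latency", "mean"),
        "Total Cost (USD)": ("total_cost", "sum"),
        "relevance": ("relevance", "mean"),
    })
    .reset_index()
)

# Join metadata and round numeric columns to 3 decimal places.
leaderboard = agg.merge(versions, on="app_id", how="inner").round(3)
```

The inner join drops version `a3`, which has no records; this corresponds to the default `show_all=False` behavior described in the I/O Contract.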
Usage
Use the Leaderboard Tab as the entry point to the TruLens dashboard for an overview of how all app versions are performing. It is the recommended starting point for identifying regressions, selecting versions to drill into via the Records tab, or setting up comparisons via the Compare tab. The pinning feature allows users to bookmark important baseline versions.
Code Reference
Source Location
Signature
APP_COLS = ["app_version", "app_id", "app_name"]
APP_AGG_COLS = [
    "Records",
    "Average Latency (s)",
    "Total Cost (USD)",
    "Total Cost (Snowflake Credits)",
]

def _get_nonzero_cost_columns(df: pd.DataFrame) -> List[str]: ...

def init_page_state() -> None: ...

def _preprocess_df(
    records_df: pd.DataFrame,
    app_versions_df: pd.DataFrame,
    feedback_col_names: List[str],
    metadata_col_names: List[str],
    show_all: bool = False,
) -> pd.DataFrame: ...

def order_columns(
    df: pd.DataFrame,
    order: Sequence[str],
) -> pd.DataFrame: ...

def _build_grid_options(
    df: pd.DataFrame,
    feedback_col_names: List[str],
    feedback_directions: Dict[str, bool],
    version_metadata_col_names: Sequence[str],
) -> dict: ...

def _render_grid(
    df: pd.DataFrame,
    feedback_col_names: List[str],
    feedback_directions: Dict[str, bool],
    version_metadata_col_names: List[str],
    grid_key: Optional[str] = None,
) -> pd.DataFrame: ...

def handle_pin_toggle(
    selected_app_ids: List[str],
    on_leaderboard: bool,
) -> None: ...

def handle_table_edit(
    df: pd.DataFrame,
    event_data: Dict[str, Any],
    version_metadata_col_names: List[str],
) -> None: ...

@streamlit_compat.st_dialog("Add/Edit Metadata")
def handle_add_metadata(
    selected_rows: pd.DataFrame,
    metadata_col_names: List[str],
) -> None: ...

@streamlit_compat.st_dialog("Add Virtual App")
def handle_add_virtual_app(
    app_name: str,
    feedback_col_names: List[str],
    feedback_defs: Any,
    metadata_col_names: List[str],
) -> None: ...

def _render_grid_tab(
    df: pd.DataFrame,
    feedback_col_names: List[str],
    feedback_defs: Any,
    feedback_directions: Dict[str, bool],
    version_metadata_col_names: List[str],
    app_name: str,
    grid_key: Optional[str] = None,
) -> None: ...

@streamlit_compat.st_fragment
def _render_list_tab(
    df: pd.DataFrame,
    feedback_col_names: List[str],
    feedback_directions: Dict[str, bool],
    version_metadata_col_names: List[str],
    max_feedback_cols: int = 6,
) -> None: ...

@streamlit_compat.st_fragment
def _render_plot_tab(
    df: pd.DataFrame,
    feedback_col_names: List[str],
) -> None: ...

def render_leaderboard(app_name: str) -> None: ...

def leaderboard_main() -> None: ...
Import
from trulens.dashboard.tabs.Leaderboard import render_leaderboard
from trulens.dashboard.tabs.Leaderboard import leaderboard_main
from trulens.dashboard.tabs.Leaderboard import init_page_state
I/O Contract
Inputs
init_page_state
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| (none) | -- | -- | Reads LEADERBOARD_PAGE_NAME.only_show_pinned and LEADERBOARD_PAGE_NAME.metadata_cols from query parameters and session state. |
_get_nonzero_cost_columns
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| df | pd.DataFrame | yes | DataFrame containing "Total Cost (USD)" and/or "Total Cost (Snowflake Credits)" columns. |
_preprocess_df
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| records_df | pd.DataFrame | yes | Raw records DataFrame with columns including record_id, latency, total_cost, cost_currency, total_tokens, tags, and feedback columns. |
| app_versions_df | pd.DataFrame | yes | App versions DataFrame with app_id, app_name, app_version, and metadata columns. |
| feedback_col_names | List[str] | yes | List of feedback column names to aggregate (mean). |
| metadata_col_names | List[str] | yes | List of metadata column names to join from the app versions table. |
| show_all | bool | no | If True, uses a right join to include versions with no records. Defaults to False. |
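The effect of `show_all` can be illustrated with a small pandas sketch. It assumes the aggregated records are the left frame, the app versions table is the right frame, and the default is an inner join; those choices, along with the sample data, are assumptions for illustration rather than a copy of the module's code.

```python
import pandas as pd

# Hypothetical aggregated records: only version "a1" has any records.
agg = pd.DataFrame({"app_id": ["a1"], "Records": [2]})

# Hypothetical app versions table: "a2" exists but has no records yet.
versions = pd.DataFrame({
    "app_id": ["a1", "a2"],
    "app_version": ["v1", "v2"],
})

# Assumed default behavior: keep only versions that have records.
with_records_only = agg.merge(versions, on="app_id", how="inner")

# Right join keeps every version; missing aggregates become NaN.
all_versions = agg.merge(versions, on="app_id", how="right")
```

In the right-joined result, `a2` appears with a missing `Records` value, so a caller would need to decide how to display or fill it.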
render_leaderboard
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| app_name | str | yes | The name of the application to render the leaderboard for. |
handle_pin_toggle
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| selected_app_ids | List[str] | yes | List of app IDs whose pinned status should be toggled. |
| on_leaderboard | bool | yes | Current pinned state; True means the apps are currently pinned (will be unpinned). |
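The toggle semantics of `on_leaderboard` can be sketched in plain Python. The function `toggle_pins` and the in-memory set of pinned IDs are hypothetical; how the module actually stores the pinned flag is not described here.

```python
from typing import List, Set


def toggle_pins(
    pinned: Set[str],
    selected_app_ids: List[str],
    currently_pinned: bool,
) -> Set[str]:
    """Return the new set of pinned app IDs after a toggle.

    currently_pinned plays the role of on_leaderboard: True means the
    selected apps are pinned now and should be unpinned.
    """
    selected = set(selected_app_ids)
    if currently_pinned:
        return pinned - selected
    return pinned | selected


# Hypothetical app IDs, for illustration only.
pinned = {"app_a"}
after_pin = toggle_pins(pinned, ["app_b", "app_c"], currently_pinned=False)
after_unpin = toggle_pins(after_pin, ["app_a"], currently_pinned=True)
```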
handle_table_edit
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| df | pd.DataFrame | yes | Current leaderboard DataFrame. |
| event_data | Dict[str, Any] | yes | AgGrid cellValueChanged event data containing the edited row and changed values. |
| version_metadata_col_names | List[str] | yes | Metadata column names that are editable. |
handle_add_metadata
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| selected_rows | pd.DataFrame | yes | DataFrame of currently selected app version rows. |
| metadata_col_names | List[str] | yes | Existing metadata column names (will be extended if a new key is added). |
handle_add_virtual_app
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| app_name | str | yes | Name of the application to create the virtual version under. |
| feedback_col_names | List[str] | yes | Available feedback metric names. |
| feedback_defs | Any | yes | Feedback definitions DataFrame with feedback_name and feedback_definition_id columns. |
| metadata_col_names | List[str] | yes | Existing metadata column names. |
Outputs
_get_nonzero_cost_columns
| Name | Type | Description |
| --- | --- | --- |
| cols | List[str] | List of cost column names that have any nonzero values, ordered USD first then Snowflake Credits. |
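A check of this kind could be written as follows. This is a minimal sketch based only on the contract above (nonzero columns, USD first, then Snowflake Credits); the function name `nonzero_cost_columns` is hypothetical and the treatment of missing values as zero is an assumption, so the real `_get_nonzero_cost_columns` may behave differently in edge cases.

```python
from typing import List

import pandas as pd

# Order matters: USD first, then Snowflake Credits.
COST_COLS = ["Total Cost (USD)", "Total Cost (Snowflake Credits)"]


def nonzero_cost_columns(df: pd.DataFrame) -> List[str]:
    """Return the cost columns present in df that hold at least one nonzero value."""
    return [
        col
        for col in COST_COLS
        # Assumption: missing values count as zero cost.
        if col in df.columns and (df[col].fillna(0) != 0).any()
    ]


costs = pd.DataFrame({
    "Total Cost (USD)": [0.0, 0.05, 0.12],
    "Total Cost (Snowflake Credits)": [0.0, 0.0, 0.0],
})
visible = nonzero_cost_columns(costs)
```

Here `visible` contains only the USD column, because every Snowflake Credits value is zero.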
_preprocess_df
| Name | Type | Description |
| --- | --- | --- |
| df | pd.DataFrame | Aggregated DataFrame grouped by app version, with record counts, latency means, cost sums, feedback means, and joined metadata, rounded to 3 decimal places. |
render_leaderboard
| Name | Type | Description |
| --- | --- | --- |
| (none -- renders to Streamlit) | None | Renders the complete leaderboard UI with three sub-tabs. |
_render_grid
| Name | Type | Description |
| --- | --- | --- |
| selected_rows | pd.DataFrame | DataFrame of user-selected rows from the grid (may be empty). |
Usage Examples
# Example 1: Running the leaderboard as a standalone Streamlit page
# (typically invoked as: streamlit run tabs/Leaderboard.py)
from trulens.dashboard.tabs.Leaderboard import leaderboard_main

if __name__ == "__main__":
    leaderboard_main()

# Example 2: Rendering the leaderboard within a custom Streamlit app
from trulens.dashboard.tabs.Leaderboard import init_page_state, render_leaderboard

init_page_state()  # read pinned/metadata settings from query params and session state
render_leaderboard(app_name="my_rag_pipeline")

# Example 3: Using _get_nonzero_cost_columns to conditionally display cost data
import pandas as pd
from trulens.dashboard.tabs.Leaderboard import _get_nonzero_cost_columns

df = pd.DataFrame({
    "Total Cost (USD)": [0.0, 0.05, 0.12],
    "Total Cost (Snowflake Credits)": [0.0, 0.0, 0.0],
})
cost_cols = _get_nonzero_cost_columns(df)
# Returns: ["Total Cost (USD)"] -- Snowflake Credits excluded because all zeros

# Example 4: Using _preprocess_df to prepare aggregated leaderboard data
# raw_records_df and versions_df are assumed to be DataFrames loaded elsewhere,
# with the columns described in the I/O Contract above.
from trulens.dashboard.tabs.Leaderboard import _preprocess_df

aggregated_df = _preprocess_df(
    records_df=raw_records_df,
    app_versions_df=versions_df,
    feedback_col_names=["relevance", "groundedness", "answer_relevance"],
    metadata_col_names=["model_name", "prompt_template"],
    show_all=False,  # keep only versions that have records
)
Related Pages