Principle: TruEra TruLens Feedback Function Definition
| Knowledge Sources | |
|---|---|
| Domains | LLM_Evaluation, Observability |
| Last Updated | 2026-02-14 08:00 GMT |
Overview
A composable evaluation specification pattern that binds an LLM judge function to specific data selectors from application traces to define automated quality metrics.
Description
Feedback Function Definition is the core evaluation specification mechanism in TruLens. A feedback function (now called a Metric) consists of three parts:
- Implementation: The actual evaluation function (e.g., context relevance scorer from a provider)
- Selectors: Bindings that specify which parts of the application trace to extract as inputs
- Aggregation: An optional function to combine multiple evaluation results into a single score
The composable API allows chaining selector methods like .on_input(), .on_output(), and .on() to bind function parameters to specific trace data. This decouples what to evaluate from where the data comes from.
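As an illustration of that decoupling, the chaining pattern can be sketched in plain Python. This is a toy `Metric` class, not the actual TruLens API; the method names mirror the real selectors, but the flat-dictionary trace format and the word-overlap judge are invented for the example:

```python
class Metric:
    """Toy sketch of a composable feedback function (not the real TruLens API)."""

    def __init__(self, implementation, aggregation=max):
        self.implementation = implementation  # the judge function ("what")
        self.aggregation = aggregation        # combines multiple results
        self.selectors = []                   # ordered trace paths ("where")

    def on(self, path):
        self.selectors.append(path)  # bind the next parameter to a trace path
        return self                  # returning self enables chaining

    def on_input(self):
        return self.on("record.input")

    def on_output(self):
        return self.on("record.output")

    def evaluate(self, trace):
        args = [trace[path] for path in self.selectors]
        result = self.implementation(*args)
        # a selector may yield several values (e.g., many context chunks);
        # aggregation collapses a list of scores into one number
        if isinstance(result, list):
            return self.aggregation(result)
        return result


# A trivial stand-in judge: fraction of question words present in the answer.
def overlap_judge(question, answer):
    q = set(question.lower().split())
    return len(q & set(answer.lower().split())) / len(q)


answer_relevance = Metric(overlap_judge).on_input().on_output()
trace = {
    "record.input": "what is trulens",
    "record.output": "TruLens is an evaluation library",
}
score = answer_relevance.evaluate(trace)
```

The same `Metric` could be rebound to different trace locations (for example, a retrieval span instead of the final output) without touching the judge itself, which is the point of separating implementation from selectors.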
The RAG Triad is the canonical example: three feedback functions evaluating Answer Relevance, Context Relevance, and Groundedness — together providing comprehensive RAG quality assessment.
Usage
Use this principle after configuring a feedback provider. Define feedback functions when you need automated, repeatable quality evaluation of application traces. The RAG Triad (answer relevance, context relevance, groundedness) is recommended as a baseline for any RAG application.
Theoretical Basis
Feedback functions implement a declarative evaluation specification pattern:
Pseudo-code Logic:
```python
# Abstract feedback function definition
metric = Metric(
    implementation=judge_function,  # what to evaluate with
    aggregation=aggregator          # how to combine multiple results
).bind(
    param_1=selector_for_input,     # where to get the first argument
    param_2=selector_for_context    # where to get the second argument
)
# Result: a fully specified, executable evaluation unit
```
The RAG Triad evaluates three orthogonal quality dimensions:
- Answer Relevance: Does the answer address the question? (input -> output)
- Context Relevance: Is the retrieved context relevant to the question? (input -> context)
- Groundedness: Is the answer supported by the retrieved context? (context -> output)
These three metrics form a complete quality assessment because any RAG failure must manifest in at least one of these dimensions.
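The three arrows above can be written out under the same toy conventions as before: hypothetical trace keys and a word-overlap judge standing in for an LLM scorer, purely to show how each metric binds a different (source, target) pair from the trace:

```python
def overlap(src, tgt):
    """Toy stand-in for an LLM judge: fraction of src words found in tgt."""
    s = set(src.lower().split())
    return len(s & set(tgt.lower().split())) / len(s)


# hypothetical trace captured from one RAG invocation
trace = {
    "input":   "who wrote hamlet",
    "context": "hamlet was written by william shakespeare",
    "output":  "william shakespeare wrote hamlet",
}

# the three orthogonal checks, each binding a different (source, target) pair
rag_triad = {
    "answer_relevance":  ("input",   "output"),   # question -> answer
    "context_relevance": ("input",   "context"),  # question -> retrieved text
    "groundedness":      ("context", "output"),   # retrieved text -> answer
}

scores = {
    name: overlap(trace[src], trace[tgt])
    for name, (src, tgt) in rag_triad.items()
}
```

A failure in any single stage depresses exactly the score that binds it: a bad retriever lowers context relevance, a hallucinating generator lowers groundedness, and an off-topic answer lowers answer relevance, even if the other two scores stay high.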