Principle:Explodinggradients Ragas Topic Adherence Evaluation
Topic Adherence Evaluation
Topic Adherence Evaluation is the principle of measuring whether an AI agent's responses stay within designated topic boundaries during a multi-turn conversation. This is essential for ensuring that agents remain focused on their intended domain and do not provide information on restricted or out-of-scope subjects.
Theoretical Foundation
Topic Boundaries in Agent Systems
AI agents deployed in specific domains (customer support, medical advice, legal consultation) are often expected to stay within well-defined topic boundaries. Topic adherence evaluation quantifies how well an agent respects these boundaries by comparing the topics actually discussed against a set of reference (allowed) topics.
The evaluation answers two questions:
- Did the agent address topics that fall within the allowed set? (relevance)
- Did the agent refuse to engage with topics outside the allowed set? (restraint)
Three-Phase Evaluation Pipeline
Topic adherence evaluation uses a three-phase, LLM-based pipeline:
Phase 1: Topic Extraction
An LLM analyzes the full conversation transcript and identifies the distinct topics that were discussed. This produces a list of topic descriptors that characterize what the conversation covered.
Phase 2: Topic Refusal Detection
For each extracted topic, an LLM determines whether the agent actually answered questions about that topic or refused to engage. This produces a boolean "answered" verdict for each topic. Topics where the agent refused to answer are treated differently in the scoring -- a refusal on an out-of-scope topic is correct behavior, while a refusal on an in-scope topic is not.
Phase 3: Topic Classification
An LLM classifies each extracted topic against the reference topic list, determining whether each discussed topic falls within the allowed scope. This produces a boolean "in-scope" classification for each topic.
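The data flow through the three phases can be sketched as follows. This is only an illustration, not the Ragas implementation: `run_pipeline` and the three callables are hypothetical names, and the toy keyword-based stand-ins replace what would be LLM calls in practice.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures; in a real pipeline each callable would wrap an LLM prompt.
ExtractFn = Callable[[str], List[str]]          # transcript -> topics
AnsweredFn = Callable[[str, str], bool]         # (transcript, topic) -> answered?
ClassifyFn = Callable[[str, List[str]], bool]   # (topic, reference_topics) -> in scope?


def run_pipeline(
    transcript: str,
    reference_topics: List[str],
    extract_topics: ExtractFn,
    was_answered: AnsweredFn,
    is_in_scope: ClassifyFn,
) -> Tuple[List[str], List[bool], List[bool]]:
    """Return the extracted topics plus two boolean lists aligned with them."""
    topics = extract_topics(transcript)                            # Phase 1
    answered = [was_answered(transcript, t) for t in topics]       # Phase 2
    in_scope = [is_in_scope(t, reference_topics) for t in topics]  # Phase 3
    return topics, answered, in_scope


# Toy stand-ins (assumptions for illustration only; no LLM is involved).
def toy_extract(transcript: str) -> List[str]:
    return [t for t in ("billing", "weather") if t in transcript]


def toy_answered(transcript: str, topic: str) -> bool:
    return f"cannot help with {topic}" not in transcript


def toy_in_scope(topic: str, reference_topics: List[str]) -> bool:
    return topic in reference_topics


transcript = (
    "user: question about billing\n"
    "agent: here is your billing summary\n"
    "user: what about the weather?\n"
    "agent: sorry, I cannot help with weather"
)
topics, answered, in_scope = run_pipeline(
    transcript, ["billing"], toy_extract, toy_answered, toy_in_scope
)
print(topics, answered, in_scope)
```

In this toy conversation the agent answers the in-scope topic and refuses the out-of-scope one, so both boolean lists come out as `[True, False]`.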
Scoring Modes
The evaluation supports three scoring modes based on precision, recall, and F1:
- Precision: Measures what fraction of the topics the agent actively discussed are within the allowed scope. High precision means the agent rarely discussed off-topic subjects.
precision = TP / (TP + FP)
- Recall: Measures what fraction of the in-scope topics the agent actually addressed (rather than refusing). High recall means the agent engaged with most of the relevant topics it should have.
recall = TP / (TP + FN)
- F1 (default): The harmonic mean of precision and recall, providing a balanced measure.
F1 = 2 * (precision * recall) / (precision + recall)
Where:
- TP (True Positives): Topics that the agent answered AND are classified as in-scope
- FP (False Positives): Topics that the agent answered BUT are classified as out-of-scope
- FN (False Negatives): Topics that the agent refused to answer BUT are classified as in-scope
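As a minimal sketch of the three formulas, the function below computes the score for a given mode from the TP, FP, and FN counts. The name `adherence_score` is hypothetical, and returning 0.0 when a denominator is zero is an assumption made here, not something the source specifies.

```python
def adherence_score(tp: int, fp: int, fn: int, mode: str = "f1") -> float:
    """Compute precision, recall, or F1 from topic-level counts."""
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if mode == "precision":
        return precision
    if mode == "recall":
        return recall
    if mode == "f1":
        if precision + recall == 0:
            return 0.0
        return 2 * precision * recall / (precision + recall)
    raise ValueError(f"unknown mode: {mode!r}")


# Example: 2 topics answered in-scope, 2 answered out-of-scope, none refused in-scope.
print(adherence_score(2, 2, 0, "precision"))  # 0.5
print(adherence_score(2, 2, 0, "recall"))     # 1.0
print(adherence_score(2, 2, 0, "f1"))         # roughly 0.667
```

The example shows why the mode matters: the agent engaged with every in-scope topic (recall 1.0) but half of what it discussed was off-topic (precision 0.5), and F1 sits between the two.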
Boolean Array Operations
The scoring relies on element-wise boolean operations between two arrays:
- topic_answered_verdict: Whether the agent answered each topic (True) or refused (False)
- topic_classifications: Whether each topic is in-scope (True) or out-of-scope (False)
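Element-wise combinations of these arrays yield the counts needed for the chosen scoring mode: TP is answered AND in-scope, FP is answered AND NOT in-scope, and FN is NOT answered AND in-scope. Topics that were refused and are out-of-scope (correct refusals) fall into none of the three counts.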
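The counting step can be illustrated with plain Python lists standing in for the two boolean arrays. The helper name `confusion_counts` is hypothetical, and the sketch assumes both lists are aligned so that position i refers to the same topic.

```python
from typing import List, Tuple


def confusion_counts(
    answered: List[bool], in_scope: List[bool]
) -> Tuple[int, int, int]:
    """Derive (TP, FP, FN) from two aligned per-topic boolean lists."""
    if len(answered) != len(in_scope):
        raise ValueError("both lists must have one entry per topic")
    pairs = list(zip(answered, in_scope))
    tp = sum(a and s for a, s in pairs)        # answered AND in-scope
    fp = sum(a and not s for a, s in pairs)    # answered AND out-of-scope
    fn = sum((not a) and s for a, s in pairs)  # refused AND in-scope
    return tp, fp, fn


# Four topics, one for each combination of answered / in-scope.
topic_answered_verdict = [True, True, False, False]
topic_classifications = [True, False, True, False]
print(confusion_counts(topic_answered_verdict, topic_classifications))  # (1, 1, 1)
```

The fourth topic (refused and out-of-scope) contributes to no count, which matches the idea that a refusal on an out-of-scope topic is correct behavior and is not penalized.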
Relationship to Other Concepts
Topic adherence is complementary to Agent Goal Accuracy Evaluation. While goal accuracy evaluates whether the agent completed its task, topic adherence evaluates whether the agent stayed within appropriate boundaries while doing so. An agent might achieve its goal but violate topic boundaries, or stay on-topic but fail to complete the task.
The evaluation operates on multi-turn conversation samples that include reference_topics as the list of allowed topics.
Implemented By
- TopicAdherenceScore Metric -- the Ragas metric class that implements this evaluation principle
See Also
- Implementation:Explodinggradients_Ragas_TopicAdherenceScore_Metric
- Agent Goal Accuracy Evaluation -- evaluating whether the agent achieved its goal
- Tool Call Accuracy Evaluation -- evaluating individual tool call correctness
- Multi-Turn Evaluation Schema -- the data schema for multi-turn samples