Principle:Explodinggradients Ragas Topic Adherence Evaluation

Topic Adherence Evaluation

Topic Adherence Evaluation is the principle of measuring whether an AI agent's responses stay within designated topic boundaries during a multi-turn conversation. It matters wherever an agent must remain focused on its intended domain and must not provide information on restricted or out-of-scope subjects.

Theoretical Foundation

Topic Boundaries in Agent Systems

AI agents deployed in specific domains (customer support, medical advice, legal consultation) are often expected to stay within well-defined topic boundaries. Topic adherence evaluation quantifies how well an agent respects these boundaries by comparing the topics actually discussed against a set of reference (allowed) topics.

The evaluation answers two questions:

Did the agent address topics that fall within the allowed set? (relevance)
Did the agent refuse to engage with topics outside the allowed set? (restraint)

Three-Phase Evaluation Pipeline

Topic adherence evaluation uses a three-phase, LLM-based pipeline:

Phase 1: Topic Extraction

An LLM analyzes the full conversation transcript and identifies the distinct topics that were discussed. This produces a list of topic descriptors that characterize what the conversation covered.

Phase 2: Topic Refusal Detection

For each extracted topic, an LLM determines whether the agent actually answered questions about that topic or refused to engage. This produces a boolean "answered" verdict for each topic. How a refusal is scored depends on scope: refusing an out-of-scope topic is correct behavior, whereas refusing an in-scope topic counts against the agent.

Phase 3: Topic Classification

An LLM classifies each extracted topic against the reference topic list, determining whether each discussed topic falls within the allowed scope. This produces a boolean "in-scope" classification for each topic.
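
The three phases can be sketched as a small Python function. This is only an illustration of the data flow, not the Ragas implementation: the function name run_topic_pipeline and the three callables (extract_topics, was_answered, is_in_scope) are hypothetical stand-ins for the LLM calls, and classifying one topic per call is an assumption made here for simplicity.

```python
from typing import Callable, List, Tuple


def run_topic_pipeline(
    transcript: str,
    reference_topics: List[str],
    extract_topics: Callable[[str], List[str]],
    was_answered: Callable[[str, str], bool],
    is_in_scope: Callable[[str, List[str]], bool],
) -> Tuple[List[str], List[bool], List[bool]]:
    """Run the three phases and return parallel lists, one entry per topic."""
    # Phase 1: extract the distinct topics discussed in the conversation.
    topics = extract_topics(transcript)
    # Phase 2: decide, per topic, whether the agent answered or refused.
    answered = [was_answered(transcript, topic) for topic in topics]
    # Phase 3: decide, per topic, whether it falls within the allowed scope.
    in_scope = [is_in_scope(topic, reference_topics) for topic in topics]
    return topics, answered, in_scope


# Toy stand-ins for the LLM calls (hypothetical, for illustration only).
def fake_extract(transcript: str) -> List[str]:
    return ["billing", "weather"]


def fake_answered(transcript: str, topic: str) -> bool:
    return topic == "billing"


def fake_in_scope(topic: str, reference_topics: List[str]) -> bool:
    return topic in reference_topics


topics, answered, in_scope = run_topic_pipeline(
    "example transcript", ["billing"], fake_extract, fake_answered, fake_in_scope
)
print(topics, answered, in_scope)
```

In this toy run the agent answers the in-scope "billing" topic and refuses the out-of-scope "weather" topic, which is the desired behavior on both counts.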

Scoring Modes

The evaluation supports three scoring modes: precision, recall, and F1.

Precision: Measures what fraction of the topics the agent actively discussed are within the allowed scope. High precision means the agent rarely discussed off-topic subjects.

precision = TP / (TP + FP)

Recall: Measures what fraction of the in-scope topics the agent actually addressed (rather than refusing). High recall means the agent engaged with most of the relevant topics it should have.

recall = TP / (TP + FN)

F1 (default): The harmonic mean of precision and recall, providing a balanced measure.

F1 = 2 * (precision * recall) / (precision + recall)

Where:

TP (True Positives): Topics that the agent answered AND are classified as in-scope
FP (False Positives): Topics that the agent answered BUT are classified as out-of-scope
FN (False Negatives): Topics that the agent refused to answer BUT are classified as in-scope
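
The formulas above can be worked through in plain Python. This is a minimal sketch rather than the Ragas code: the function name scores_from_counts is hypothetical, and returning 0.0 when a denominator is zero is an assumption, since the text does not say how empty cases are handled.

```python
def scores_from_counts(tp: int, fp: int, fn: int) -> dict:
    """Compute precision, recall, and F1 from topic-level counts.

    Assumption: a zero denominator yields 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * (precision * recall) / denom if denom > 0 else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Example: the agent answered 3 in-scope topics (TP), answered
# 1 out-of-scope topic (FP), and refused 1 in-scope topic (FN).
scores = scores_from_counts(tp=3, fp=1, fn=1)
print(scores)
```

Here precision and recall are both 3/4, so their harmonic mean (F1) is also 0.75.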

Boolean Array Operations

The scoring relies on element-wise boolean operations between two arrays:

topic_answered_verdict: Whether the agent answered each topic (True) or refused (False)
topic_classifications: Whether each topic is in-scope (True) or out-of-scope (False)

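Combining these arrays element-wise yields the counts needed for the chosen scoring mode: TP is the number of topics that are answered AND in-scope, FP the number answered AND NOT in-scope, and FN the number NOT answered AND in-scope. Topics that are neither answered nor in-scope (correct refusals) do not enter any of the three formulas.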
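
The element-wise operations on the two arrays can be sketched with NumPy. The array names follow the text; the verdict values are invented for illustration, and the use of NumPy here is an assumption about a convenient way to express the operations, not a statement about the library's internals.

```python
import numpy as np

# Hypothetical verdicts for four extracted topics (illustrative values).
topic_answered_verdict = np.array([True, True, False, False])
topic_classifications = np.array([True, False, True, False])

# Answered AND in-scope.
tp = int(np.sum(topic_answered_verdict & topic_classifications))
# Answered AND NOT in-scope.
fp = int(np.sum(topic_answered_verdict & ~topic_classifications))
# NOT answered (refused) AND in-scope.
fn = int(np.sum(~topic_answered_verdict & topic_classifications))

print(tp, fp, fn)
```

The fourth topic was refused and is out-of-scope, so it contributes to none of the three counts.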

Relationship to Other Concepts

Topic adherence is complementary to Agent Goal Accuracy Evaluation. While goal accuracy evaluates whether the agent completed its task, topic adherence evaluates whether the agent stayed within appropriate boundaries while doing so. An agent might achieve its goal but violate topic boundaries, or stay on-topic but fail to complete the task.

The evaluation operates on multi-turn conversation samples that include reference_topics as the list of allowed topics.

Implemented By

TopicAdherenceScore Metric -- the Ragas metric class that implements this evaluation principle
