Implementation:Confident ai Deepeval Evaluate Trace

Overview

Evaluate Trace covers the functions for offline evaluation of previously collected traces, spans, and threads against named metric collections on the Confident AI platform. Three functions (evaluate_trace, evaluate_span, and evaluate_thread) submit evaluation requests at different granularities without re-running the original application.

API Documentation

Function: evaluate_trace

Source: deepeval/tracing/offline_evals/trace.py

Import:

from deepeval.tracing import evaluate_trace

Signature:

evaluate_trace(trace_uuid: str, metric_collection: str)

Parameter	Type	Description
`trace_uuid`	`str`	The unique identifier of the trace to evaluate.
`metric_collection`	`str`	The name of the metric collection on Confident AI to apply.

Function: evaluate_span

Source: deepeval/tracing/offline_evals/span.py

Import:

from deepeval.tracing import evaluate_span

Signature:

evaluate_span(span_uuid: str, metric_collection: str)

Parameter	Type	Description
`span_uuid`	`str`	The unique identifier of the span to evaluate.
`metric_collection`	`str`	The name of the metric collection on Confident AI to apply.

Function: evaluate_thread

Source: deepeval/tracing/offline_evals/thread.py

Import:

from deepeval.tracing import evaluate_thread

Signature:

evaluate_thread(thread_id: str, metric_collection: str, overwrite_metrics: bool = False)

Parameter	Type	Description
`thread_id`	`str`	The identifier of the conversation thread to evaluate.
`metric_collection`	`str`	The name of the metric collection on Confident AI to apply.
`overwrite_metrics`	`bool`	When `True`, overwrites any existing metric results for this thread. Defaults to `False`.

Input / Output (All Functions)

Inputs: A target identifier (trace UUID, span UUID, or thread ID) and the name of a metric collection defined on the Confident AI platform.
Outputs: An evaluation request is submitted to Confident AI. The results appear in the Confident AI dashboard, attached to the specified trace, span, or thread.
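
Because all three functions take an identifier plus a metric collection name, a thin dispatcher can route a request by granularity. The following is a minimal sketch and not part of DeepEval: `submit_offline_eval` and the `evaluators` mapping are hypothetical names introduced for illustration, and the evaluator callables are injected so the routing logic can be exercised without contacting Confident AI.

```python
from typing import Any, Callable, Dict

# Keyword name each documented function uses for its target identifier.
ID_PARAM = {"trace": "trace_uuid", "span": "span_uuid", "thread": "thread_id"}


def submit_offline_eval(
    kind: str,
    target_id: str,
    metric_collection: str,
    evaluators: Dict[str, Callable[..., Any]],
    **extra: Any,
) -> Any:
    """Route an offline evaluation request to the evaluator for `kind`.

    Hypothetical helper: `evaluators` maps "trace", "span", or "thread"
    to a callable with the matching documented signature.
    """
    if kind not in ID_PARAM:
        raise ValueError(f"unknown kind {kind!r}; expected one of {sorted(ID_PARAM)}")
    # Only evaluate_thread documents an extra option (overwrite_metrics).
    if extra and kind != "thread":
        raise TypeError(f"extra options are not supported for kind {kind!r}")
    kwargs = {ID_PARAM[kind]: target_id, "metric_collection": metric_collection}
    kwargs.update(extra)
    return evaluators[kind](**kwargs)
```

In real use, the mapping would presumably be `{"trace": evaluate_trace, "span": evaluate_span, "thread": evaluate_thread}` with the functions imported from `deepeval.tracing`, and `overwrite_metrics=True` could be forwarded for threads.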

Usage Examples

Evaluating a Trace

from deepeval.tracing import evaluate_trace

evaluate_trace(trace_uuid="abc-123", metric_collection="quality-checks")

Evaluating a Span

from deepeval.tracing import evaluate_span

evaluate_span(span_uuid="def-456", metric_collection="retrieval-quality")

Evaluating a Thread

from deepeval.tracing import evaluate_thread

evaluate_thread(
    thread_id="thread-789",
    metric_collection="conversation-quality",
    overwrite_metrics=True,
)
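
Submitting a Batch

The examples above each submit a single request. When many previously collected traces need the same metric collection, a small loop that records failures can be convenient. This is a sketch under stated assumptions: `evaluate_many` is a hypothetical helper, not a DeepEval function, and since the error types raised by the evaluate functions are not documented on this page, it catches `Exception` broadly.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def evaluate_many(
    evaluate: Callable[..., Any],
    id_param: str,
    target_ids: Iterable[str],
    metric_collection: str,
) -> Tuple[List[str], Dict[str, str]]:
    """Submit one evaluation request per identifier, collecting failures.

    `evaluate` is any callable with a documented signature, for example
    evaluate_trace with id_param="trace_uuid" (hypothetical wiring).
    """
    submitted: List[str] = []
    failed: Dict[str, str] = {}
    for target_id in target_ids:
        try:
            evaluate(**{id_param: target_id, "metric_collection": metric_collection})
        except Exception as exc:  # assumption: specific error types are unknown
            failed[target_id] = str(exc)
        else:
            submitted.append(target_id)
    return submitted, failed
```

A call might look like `evaluate_many(evaluate_trace, "trace_uuid", ["abc-123"], "quality-checks")`; the second element of the returned tuple maps each failed identifier to its error message so the batch can be retried selectively.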

Relationships

Principle:Confident_ai_Deepeval_Offline_Trace_Evaluation

Metadata

DeepEval Tracing Observability LLM_Evaluation 2026-02-14 09:00 GMT
