Implementation:Confident ai Deepeval Evaluate Trace

From Leeroopedia

Overview

Evaluate Trace covers the functions for offline evaluation of previously collected traces, spans, and threads against named metric collections on the Confident AI platform. Three functions (evaluate_trace, evaluate_span, and evaluate_thread) submit evaluation requests at different granularities without re-running the original application.

API Documentation

Function: evaluate_trace

Source: deepeval/tracing/offline_evals/trace.py

Import:

from deepeval.tracing import evaluate_trace

Signature:

evaluate_trace(trace_uuid: str, metric_collection: str)

Parameters:

Parameter          Type  Description
trace_uuid         str   The unique identifier of the trace to evaluate.
metric_collection  str   The name of the metric collection on Confident AI to apply.

Function: evaluate_span

Source: deepeval/tracing/offline_evals/span.py

Import:

from deepeval.tracing import evaluate_span

Signature:

evaluate_span(span_uuid: str, metric_collection: str)

Parameters:

Parameter          Type  Description
span_uuid          str   The unique identifier of the span to evaluate.
metric_collection  str   The name of the metric collection on Confident AI to apply.

Function: evaluate_thread

Source: deepeval/tracing/offline_evals/thread.py

Import:

from deepeval.tracing import evaluate_thread

Signature:

evaluate_thread(thread_id: str, metric_collection: str, overwrite_metrics: bool = False)

Parameters:

Parameter          Type  Description
thread_id          str   The identifier of the conversation thread to evaluate.
metric_collection  str   The name of the metric collection on Confident AI to apply.
overwrite_metrics  bool  When True, overwrites any existing metric results for this thread. Defaults to False.

Input / Output (All Functions)

  • Inputs: A target identifier (trace UUID, span UUID, or thread ID) and the name of a metric collection defined on the Confident AI platform.
  • Outputs: An evaluation request is submitted to Confident AI. The results appear in the Confident AI dashboard, attached to the specified trace, span, or thread.
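
The three signatures differ only in the keyword used for the target identifier, plus the extra overwrite_metrics flag on evaluate_thread. A minimal sketch of how a caller might build the right keyword arguments for each granularity is shown here. The helper build_eval_kwargs and the ID_KEYWORDS table are hypothetical names introduced for illustration and are not part of DeepEval; only the keyword names themselves come from the signatures documented above.

```python
from typing import Any, Dict

# Keyword each documented function uses for its target identifier.
ID_KEYWORDS: Dict[str, str] = {
    "trace": "trace_uuid",
    "span": "span_uuid",
    "thread": "thread_id",
}


def build_eval_kwargs(
    kind: str,
    target_id: str,
    metric_collection: str,
    overwrite_metrics: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for one granularity."""
    if kind not in ID_KEYWORDS:
        raise ValueError(f"unknown granularity: {kind!r}")
    kwargs: Dict[str, Any] = {
        ID_KEYWORDS[kind]: target_id,
        "metric_collection": metric_collection,
    }
    # Only evaluate_thread documents an overwrite_metrics parameter.
    if kind == "thread":
        kwargs["overwrite_metrics"] = overwrite_metrics
    return kwargs
```

With DeepEval installed, the result could be unpacked into the matching function, for example evaluate_span(**build_eval_kwargs("span", "def-456", "retrieval-quality")).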

Usage Examples

Evaluating a Trace

from deepeval.tracing import evaluate_trace

evaluate_trace(trace_uuid="abc-123", metric_collection="quality-checks")

Evaluating a Span

from deepeval.tracing import evaluate_span

evaluate_span(span_uuid="def-456", metric_collection="retrieval-quality")

Evaluating a Thread

from deepeval.tracing import evaluate_thread

evaluate_thread(
    thread_id="thread-789",
    metric_collection="conversation-quality",
    overwrite_metrics=True,
)
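
Because each call targets a single identifier, evaluating many stored traces or spans means calling the function once per ID. The sketch below shows one way to do that while recording which submissions failed. The helper submit_for_all is a hypothetical name, the evaluation function is passed in as an argument so the sketch does not depend on DeepEval being importable, and catching a broad Exception is an assumption, since this page does not document which errors these functions raise.

```python
from typing import Callable, Dict, Iterable, List


def submit_for_all(
    evaluate_fn: Callable[..., object],
    id_keyword: str,
    target_ids: Iterable[str],
    metric_collection: str,
) -> Dict[str, List[str]]:
    """Hypothetical helper: submit one evaluation request per identifier."""
    submitted: List[str] = []
    failed: List[str] = []
    for target_id in target_ids:
        try:
            # e.g. evaluate_trace(trace_uuid=..., metric_collection=...)
            evaluate_fn(**{id_keyword: target_id, "metric_collection": metric_collection})
            submitted.append(target_id)
        except Exception:
            # Assumption: the raised error types are not documented here,
            # so every failure is recorded and the loop continues.
            failed.append(target_id)
    return {"submitted": submitted, "failed": failed}
```

A call such as submit_for_all(evaluate_trace, "trace_uuid", ["abc-123"], "quality-checks") would then submit each listed trace to the named metric collection.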

Relationships

Principle:Confident_ai_Deepeval_Offline_Trace_Evaluation

Metadata

DeepEval Tracing Observability LLM_Evaluation 2026-02-14 09:00 GMT
