Implementation:Confident ai Deepeval CallbackHandler LangChain

From Leeroopedia
Metadata
Last Updated 2026-02-14 09:00 GMT

Overview

A concrete callback handler class that integrates DeepEval with the LangChain and LangGraph agent frameworks. The CallbackHandler class inherits from LangChain's BaseCallbackHandler and automatically captures execution traces (LLM calls, tool invocations, retriever queries, and chain orchestration events), translating them into DeepEval's internal trace format for evaluation.

Description

The CallbackHandler is passed into LangChain's config={"callbacks": [handler]} mechanism. Once attached, it receives lifecycle events from the LangChain runtime and builds a hierarchical trace structure. At the end of an agent invocation, the handler can optionally run evaluation metrics against the collected trace and push results to the Confident AI platform.
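
The way lifecycle events can be folded into a hierarchical trace is sketched below. This is a minimal, standard-library-only illustration and not DeepEval's implementation: the Span and TraceBuilder names and their methods are hypothetical. It relies only on the fact that LangChain callbacks identify each run by a run_id and link nested runs through a parent_run_id.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    """One node in the trace tree (hypothetical structure, for illustration)."""
    run_id: str
    kind: str  # "chain", "llm", "tool", or "retriever"
    name: str
    parent_run_id: Optional[str] = None
    output: Optional[str] = None
    children: List["Span"] = field(default_factory=list)


class TraceBuilder:
    """Collects start/end events into a tree keyed by run_id."""

    def __init__(self) -> None:
        self.spans: Dict[str, Span] = {}
        self.roots: List[Span] = []

    def on_start(self, kind: str, name: str, run_id: str,
                 parent_run_id: Optional[str] = None) -> None:
        span = Span(run_id=run_id, kind=kind, name=name,
                    parent_run_id=parent_run_id)
        self.spans[run_id] = span
        parent = self.spans.get(parent_run_id) if parent_run_id else None
        if parent is not None:
            parent.children.append(span)  # nested run: attach to its parent
        else:
            self.roots.append(span)  # no known parent: top-level run

    def on_end(self, run_id: str, output: str) -> None:
        self.spans[run_id].output = output


# Simulated event stream for one agent invocation.
builder = TraceBuilder()
builder.on_start("chain", "agent", run_id="r1")
builder.on_start("llm", "chat-model", run_id="r2", parent_run_id="r1")
builder.on_end("r2", "call the search tool")
builder.on_start("tool", "search", run_id="r3", parent_run_id="r1")
builder.on_end("r3", "3 results")
builder.on_end("r1", "final answer")
```

Keying spans by run_id lets an end event find its span directly, without assuming that events arrive in strict stack order.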

Key capabilities:

  • Automatic trace capture: intercepts LLM start/end, tool start/end, chain start/end, and retriever start/end events.
  • Metric evaluation: when metrics or metric_collection are provided, evaluation runs automatically after each invocation.
  • Thread tracking: the thread_id parameter enables grouping multiple invocations into a single conversation thread.
  • Metadata and tagging: supports arbitrary metadata and tags for trace organization and filtering.
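
The "evaluation runs automatically after each invocation" behavior can be pictured as a hook that fires once, when the outermost run ends. The sketch below is an assumption about the general pattern, not DeepEval's code: EvaluatingRecorder, the Metric type alias, and the non_empty metric are all hypothetical names, and real DeepEval metrics are BaseMetric objects rather than plain functions.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical metric type: takes the final output, returns a score in [0, 1].
Metric = Callable[[str], float]


class EvaluatingRecorder:
    """Runs the configured metrics once per invocation, when the root run ends."""

    def __init__(self, metrics: Optional[List[Metric]] = None) -> None:
        self.metrics = metrics or []
        self.open_runs: Dict[str, Optional[str]] = {}  # run_id -> parent_run_id
        self.results: List[Dict[str, float]] = []

    def on_start(self, run_id: str, parent_run_id: Optional[str] = None) -> None:
        self.open_runs[run_id] = parent_run_id

    def on_end(self, run_id: str, output: str) -> None:
        parent = self.open_runs.pop(run_id)
        # Only a run without a parent marks the end of an invocation.
        if parent is None and self.metrics:
            self.results.append({m.__name__: m(output) for m in self.metrics})


def non_empty(output: str) -> float:
    """Toy metric: 1.0 if the output contains non-whitespace text."""
    return 1.0 if output.strip() else 0.0


recorder = EvaluatingRecorder(metrics=[non_empty])
recorder.on_start("root")
recorder.on_start("llm-1", parent_run_id="root")
recorder.on_end("llm-1", "intermediate text")  # nested run: no evaluation
recorder.on_end("root", "final answer")  # root run: evaluation fires

# Without metrics, nothing is evaluated.
plain = EvaluatingRecorder()
plain.on_start("root")
plain.on_end("root", "final answer")
```

Gating on the root run keeps intermediate LLM and tool outputs from being scored as if they were the final answer.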

Usage

Import and attach to any LangChain agent or chain invocation:

from deepeval.integrations.langchain import CallbackHandler

Code Reference

Source Location

  • Repository: confident-ai/deepeval
  • File: deepeval/integrations/langchain/callback.py (lines 70-849)

Signature

class CallbackHandler(BaseCallbackHandler):
    def __init__(
        self,
        name: Optional[str] = None,
        tags: Optional[List[str]] = None,
        metadata: Optional[Dict[str, Any]] = None,
        thread_id: Optional[str] = None,
        user_id: Optional[str] = None,
        metrics: Optional[List[BaseMetric]] = None,
        metric_collection: Optional[str] = None,
        test_case_id: Optional[str] = None,
    ):
        ...

Import

from deepeval.integrations.langchain import CallbackHandler

Parent Class

  • BaseCallbackHandler from langchain_core.callbacks.base

I/O Contract

Inputs (Constructor Parameters)

Input Contract
Name | Type | Default | Description
name | Optional[str] | None | Human-readable name for the traced agent or chain.
tags | Optional[List[str]] | None | Tags for categorizing and filtering traces.
metadata | Optional[Dict[str, Any]] | None | Arbitrary key-value metadata attached to the trace.
thread_id | Optional[str] | None | Conversation thread identifier for grouping related invocations.
user_id | Optional[str] | None | Identifier for the end user associated with this trace.
metrics | Optional[List[BaseMetric]] | None | List of evaluation metrics to run automatically after each invocation.
metric_collection | Optional[str] | None | Name of a pre-defined metric collection on the Confident AI platform.
test_case_id | Optional[str] | None | Identifier for linking traces to specific test cases.

Outputs

Output Contract
Name | Type | Description
Trace | Internal trace object | Hierarchical trace of the agent execution, including all LLM calls, tool invocations, and chain steps.
Metric results | Evaluation scores | When metrics are configured, evaluation results are computed and optionally pushed to the Confident AI platform.

Usage Examples

Example 1: Basic Agent Instrumentation

Attach the callback handler to a LangChain agent with automatic task completion evaluation.

from deepeval.integrations.langchain import CallbackHandler
from deepeval.metrics import TaskCompletionMetric

handler = CallbackHandler(
    metrics=[TaskCompletionMetric()],
    name="my-agent",
)
# `agent` is assumed to be an existing LangChain agent or runnable created elsewhere.
agent.invoke({"input": "Hello"}, config={"callbacks": [handler]})

  • The handler is passed via the callbacks key in the LangChain config dictionary.
  • After the agent completes, the TaskCompletionMetric is automatically evaluated against the captured trace.

Example 2: Threaded Conversation Tracking

Track multiple invocations as part of the same conversation thread.

from deepeval.integrations.langchain import CallbackHandler

handler = CallbackHandler(
    name="support-agent",
    thread_id="conv-12345",
    user_id="user-abc",
    tags=["production", "support"],
    metadata={"department": "engineering"},
)

# First turn
agent.invoke({"input": "What is my account status?"}, config={"callbacks": [handler]})

# Second turn (same thread)
agent.invoke({"input": "Can you update my email?"}, config={"callbacks": [handler]})

  • Both invocations are grouped under the same thread_id, enabling conversation-level evaluation.
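
Grouping by thread_id amounts to indexing each invocation under a shared key. The following standard-library sketch illustrates that idea only; ThreadIndex and its methods are hypothetical and do not reflect how DeepEval or the Confident AI platform store threads.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class ThreadIndex:
    """Groups recorded invocations by conversation thread (illustrative only)."""

    def __init__(self) -> None:
        self._by_thread: Dict[str, List[str]] = defaultdict(list)

    def record(self, thread_id: Optional[str], user_input: str) -> None:
        # Invocations without a thread_id are not attached to any conversation.
        if thread_id is not None:
            self._by_thread[thread_id].append(user_input)

    def turns(self, thread_id: str) -> List[str]:
        # Return a copy so callers cannot mutate the index.
        return list(self._by_thread.get(thread_id, []))


index = ThreadIndex()
index.record("conv-12345", "What is my account status?")
index.record("conv-12345", "Can you update my email?")
index.record(None, "untracked call")
```

With the turns kept in arrival order under one key, a conversation-level evaluation can read the whole exchange rather than a single request.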
