Implementation:Confident ai Deepeval CallbackHandler LangChain
| Knowledge Sources | |
|---|---|
| Domains | |
| Last Updated | 2026-02-14 09:00 GMT |
Overview
Concrete callback handler class that integrates DeepEval with LangChain and LangGraph agent frameworks. The CallbackHandler class inherits from LangChain's BaseCallbackHandler and automatically captures execution traces -- including LLM calls, tool invocations, retriever queries, and chain orchestration events -- translating them into DeepEval's internal trace format for evaluation.
Description
The CallbackHandler is passed into LangChain's config={"callbacks": [handler]} mechanism. Once attached, it receives lifecycle events from the LangChain runtime and builds a hierarchical trace structure. At the end of an agent invocation, the handler can optionally run evaluation metrics against the collected trace and push results to the Confident AI platform.
Key capabilities:
- Automatic trace capture -- intercepts LLM start/end, tool start/end, chain start/end, and retriever start/end events.
- Metric evaluation -- when metrics or metric_collection are provided, evaluation runs automatically after each invocation.
- Thread tracking -- the thread_id parameter enables grouping multiple invocations into a single conversation thread.
- Metadata and tagging -- supports arbitrary metadata and tags for trace organization and filtering.
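The hierarchical trace described above can be illustrated with a small stand-alone sketch. LangChain callback events carry a run_id and, for nested steps, a parent_run_id; the sketch uses that pairing to link each step to its parent. The Span and TraceBuilder names and the on_start/on_end methods are hypothetical stand-ins chosen for illustration. They are not DeepEval's internal trace types, and the actual handler may organize its state differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from uuid import UUID, uuid4


@dataclass
class Span:
    """One node in the trace tree (hypothetical stand-in type)."""
    run_id: UUID
    kind: str  # e.g. "chain", "llm", "tool", "retriever"
    name: str
    output: Any = None
    children: List["Span"] = field(default_factory=list)


class TraceBuilder:
    """Hypothetical helper that turns start/end events into a span tree."""

    def __init__(self) -> None:
        self.spans: Dict[UUID, Span] = {}
        self.roots: List[Span] = []

    def on_start(self, kind: str, name: str, run_id: UUID,
                 parent_run_id: Optional[UUID] = None) -> None:
        span = Span(run_id=run_id, kind=kind, name=name)
        self.spans[run_id] = span
        parent = self.spans.get(parent_run_id) if parent_run_id is not None else None
        if parent is None:
            # No known parent: this span starts a new trace.
            self.roots.append(span)
        else:
            parent.children.append(span)

    def on_end(self, run_id: UUID, output: Any) -> None:
        # Record the result on the span opened by the matching start event.
        self.spans[run_id].output = output


# Simulate one agent invocation: a chain that calls an LLM, then a tool.
builder = TraceBuilder()
chain_id, llm_id, tool_id = uuid4(), uuid4(), uuid4()
builder.on_start("chain", "agent", chain_id)
builder.on_start("llm", "chat-model", llm_id, parent_run_id=chain_id)
builder.on_end(llm_id, "call the lookup tool")
builder.on_start("tool", "lookup", tool_id, parent_run_id=chain_id)
builder.on_end(tool_id, {"status": "active"})
builder.on_end(chain_id, "Your account is active.")
```

Keying spans by run_id keeps the lookup for each end event constant-time and lets child spans attach to their parent regardless of how deeply the steps are nested.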
Usage
Import and attach to any LangChain agent or chain invocation:
from deepeval.integrations.langchain import CallbackHandler
Code Reference
Source Location
- Repository: confident-ai/deepeval
- File: deepeval/integrations/langchain/callback.py (lines 70--849)
Signature
class CallbackHandler(BaseCallbackHandler):
    def __init__(
        self,
        name: Optional[str] = None,
        tags: Optional[List[str]] = None,
        metadata: Optional[Dict[str, Any]] = None,
        thread_id: Optional[str] = None,
        user_id: Optional[str] = None,
        metrics: Optional[List[BaseMetric]] = None,
        metric_collection: Optional[str] = None,
        test_case_id: Optional[str] = None,
    ):
        ...
Import
from deepeval.integrations.langchain import CallbackHandler
Parent Class
BaseCallbackHandler from langchain_core.callbacks.base
I/O Contract
Inputs (Constructor Parameters)
| Name | Type | Default | Description |
|---|---|---|---|
| name | Optional[str] | None | Human-readable name for the traced agent or chain. |
| tags | Optional[List[str]] | None | Tags for categorizing and filtering traces. |
| metadata | Optional[Dict[str, Any]] | None | Arbitrary key-value metadata attached to the trace. |
| thread_id | Optional[str] | None | Conversation thread identifier for grouping related invocations. |
| user_id | Optional[str] | None | Identifier for the end user associated with this trace. |
| metrics | Optional[List[BaseMetric]] | None | List of evaluation metrics to run automatically after each invocation. |
| metric_collection | Optional[str] | None | Name of a pre-defined metric collection on the Confident AI platform. |
| test_case_id | Optional[str] | None | Identifier for linking traces to specific test cases. |
Outputs
| Name | Type | Description |
|---|---|---|
| Trace | Internal trace object | Hierarchical trace of the agent execution, including all LLM calls, tool invocations, and chain steps. |
| Metric results | Evaluation scores | When metrics are configured, evaluation results are computed and optionally pushed to the Confident AI platform. |
Usage Examples
Example 1: Basic Agent Instrumentation
Attach the callback handler to a LangChain agent with automatic task completion evaluation.
from deepeval.integrations.langchain import CallbackHandler
from deepeval.metrics import TaskCompletionMetric
handler = CallbackHandler(
metrics=[TaskCompletionMetric()],
name="my-agent",
)
# agent is assumed to be an already-constructed LangChain agent or runnable.
agent.invoke({"input": "Hello"}, config={"callbacks": [handler]})
- The handler is passed via the callbacks key in the LangChain config dictionary.
- After the agent completes, the TaskCompletionMetric is automatically evaluated against the captured trace.
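The "evaluate after the agent completes" behavior can be pictured with a minimal stand-alone sketch: metrics run only once the outermost run has finished, and only if any were configured. Everything here is a hypothetical illustration. The EvaluatingRecorder class, its on_root_end hook, and the non_empty_output scoring function are invented for this sketch and do not reflect DeepEval's metric interface or the handler's real internals.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical metric shape: a function that scores a finished trace.
Metric = Callable[[Dict[str, str]], float]


def non_empty_output(trace: Dict[str, str]) -> float:
    """Toy metric: 1.0 if the trace produced any output, else 0.0."""
    return 1.0 if trace.get("output") else 0.0


class EvaluatingRecorder:
    """Hypothetical recorder that scores a trace when its root run ends."""

    def __init__(self, metrics: Optional[List[Metric]] = None) -> None:
        self.metrics = metrics or []
        self.results: List[Dict[str, float]] = []

    def on_root_end(self, trace: Dict[str, str]) -> None:
        if not self.metrics:
            # Nothing configured: the trace is captured but not evaluated.
            return
        self.results.append({m.__name__: m(trace) for m in self.metrics})


recorder = EvaluatingRecorder(metrics=[non_empty_output])
recorder.on_root_end({"input": "Hello", "output": "Hi there!"})

plain_recorder = EvaluatingRecorder()
plain_recorder.on_root_end({"input": "Hello", "output": "Hi there!"})
```

Deferring evaluation to the end of the root run means each metric sees the complete trace, including every nested LLM and tool step, rather than a partial one.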
Example 2: Threaded Conversation Tracking
Track multiple invocations as part of the same conversation thread.
from deepeval.integrations.langchain import CallbackHandler
handler = CallbackHandler(
name="support-agent",
thread_id="conv-12345",
user_id="user-abc",
tags=["production", "support"],
metadata={"department": "engineering"},
)
# First turn (agent is assumed to be an already-constructed LangChain agent or runnable)
agent.invoke({"input": "What is my account status?"}, config={"callbacks": [handler]})
# Second turn (same thread)
agent.invoke({"input": "Can you update my email?"}, config={"callbacks": [handler]})
- Both invocations are grouped under the same thread_id, enabling conversation-level evaluation.
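As a rough picture of what grouping by thread_id makes possible, the sketch below collects per-invocation records into conversations keyed by their thread identifier. The TraceRecord shape and the group_by_thread function are assumptions made for illustration only; the grouping for real traces would happen wherever the traces are stored and may work differently.

```python
from collections import defaultdict
from typing import Dict, List, Optional, TypedDict


class TraceRecord(TypedDict):
    """Hypothetical, simplified view of one captured invocation."""
    thread_id: Optional[str]
    input: str


def group_by_thread(records: List[TraceRecord]) -> Dict[Optional[str], List[str]]:
    """Collect inputs per thread, preserving invocation order."""
    threads: Dict[Optional[str], List[str]] = defaultdict(list)
    for record in records:
        threads[record["thread_id"]].append(record["input"])
    return dict(threads)


records: List[TraceRecord] = [
    {"thread_id": "conv-12345", "input": "What is my account status?"},
    {"thread_id": "conv-12345", "input": "Can you update my email?"},
    {"thread_id": "conv-67890", "input": "Reset my password."},
]
threads = group_by_thread(records)
```

With the turns of a conversation collected in order, an evaluation can look at the exchange as a whole instead of judging each invocation in isolation.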