Workflow: W&B Weave LLM Integration Tracing
| Knowledge Sources | |
|---|---|
| Domains | LLM_Ops, Observability, Integrations |
| Last Updated | 2026-02-14 11:00 GMT |
Overview
End-to-end process for automatically tracing LLM provider API calls (OpenAI, Anthropic, Google, Cohere, and 20+ others) through Weave's integration patching system.
Description
This workflow covers how Weave's integration system transparently intercepts calls to popular LLM provider SDKs and logs them as traced operations. The system uses a two-phase patching strategy: implicit patching (automatic via import hooks) and explicit patching (manual function calls). When patched, every API call to a supported provider is automatically captured with full request and response details, token usage statistics, latency measurements, and streaming chunk accumulation. No changes to existing application code are required beyond initializing Weave.
Usage
Execute this workflow when you are using one or more LLM provider SDKs (such as OpenAI, Anthropic, Google GenAI, Cohere, Mistral, Groq, AWS Bedrock, or others) and want to capture all API interactions without modifying your existing calling code. This is useful for debugging prompt behavior, monitoring token costs, analyzing latency, and auditing LLM usage in production.
Execution Steps
Step 1: Initialize Weave
Initialize the Weave client with a project name. By default, this enables implicit patching, which registers an import hook that intercepts future imports of supported LLM libraries and patches their API methods automatically.
Key considerations:
- Implicit patching is enabled by default; it can be disabled via the implicitly_patch_integrations setting
- If the LLM SDK is already imported before weave.init(), the system checks sys.modules and patches retroactively
- The import hook uses Python's sys.meta_path mechanism; it runs only at import time, so it adds no cost until a supported library is actually imported
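The two patching paths can be illustrated with a runnable toy. This is not Weave's actual implementation: `fake_sdk`, `PatchOnImport`, and `init_tracing` are stand-ins invented for this sketch, and the real hook would wrap a provider SDK's genuine loader rather than fabricating the module itself.

```python
# Toy sketch of the two patching paths: a sys.meta_path hook that patches
# a target module as it is imported, and a retroactive scan of sys.modules
# for modules imported before init. "fake_sdk" stands in for a provider SDK.
import importlib.abc
import importlib.machinery
import sys

CALL_LOG = []

def traced(fn):
    """Wrap an SDK method so every call is recorded before delegating."""
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        CALL_LOG.append({"args": args, "kwargs": kwargs, "result": result})
        return result
    return wrapper

def patch_module(module):
    module.complete = traced(module.complete)

class PatchOnImport(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Toy finder/loader: builds a stand-in SDK module, then patches it.
    A real hook would delegate to the module's genuine loader instead."""
    def find_spec(self, name, path, target=None):
        if name == "fake_sdk":
            return importlib.machinery.ModuleSpec(name, self)
        return None  # all other imports proceed normally

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        module.complete = lambda prompt: f"echo:{prompt}"
        patch_module(module)  # patch immediately after load

def init_tracing():
    """Mimics init: patch already-imported targets, hook future imports."""
    if "fake_sdk" in sys.modules:             # retroactive path
        patch_module(sys.modules["fake_sdk"])
    sys.meta_path.insert(0, PatchOnImport())  # implicit path

init_tracing()
import fake_sdk          # the import hook fires here
fake_sdk.complete("hi")  # the wrapper records the call transparently
print(len(CALL_LOG))     # 1 call captured
```

The key property the sketch preserves is that the caller's code is unchanged: `fake_sdk.complete` is used exactly as the unpatched SDK would be.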
Step 2: Configure Integration Settings
Optionally configure integration-specific settings to control what gets traced. Each integration supports settings for enabling or disabling specific operations, controlling what metadata is captured, and customizing how calls are displayed in the UI.
Key considerations:
- Default settings trace all operations for all enabled integrations
- Explicit patching allows per-integration settings: weave.integrations.patch_openai(settings)
- Settings control whether to capture full request/response bodies, token usage, or just metadata
- Multiple integrations can be configured independently
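A minimal sketch of how settings might gate what a patched method records. The field names (`enabled`, `capture_io`, `capture_usage`) and the `IntegrationSettings` class are illustrative assumptions, not Weave's actual settings schema:

```python
# Hypothetical settings object gating what a patched call records.
from dataclasses import dataclass

@dataclass
class IntegrationSettings:
    enabled: bool = True        # trace this integration at all?
    capture_io: bool = True     # full request/response bodies
    capture_usage: bool = True  # token usage statistics

def patch_with_settings(fn, settings, log):
    """Wrap fn so each call is logged according to the settings."""
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        if not settings.enabled:
            return result
        record = {"op": fn.__name__}
        if settings.capture_io:
            record["inputs"] = kwargs
            record["output"] = result.get("text")
        if settings.capture_usage:
            record["usage"] = result.get("usage")
        log.append(record)
        return result
    return wrapper

def fake_completion(**kwargs):
    """Stand-in for a provider SDK method."""
    return {"text": "ok", "usage": {"total_tokens": 7}}

log = []
traced = patch_with_settings(
    fake_completion, IntegrationSettings(capture_io=False), log)
traced(prompt="hello")
print(log[0])  # {'op': 'fake_completion', 'usage': {'total_tokens': 7}}
```

With `capture_io=False`, the trace retains usage metadata but omits the request and response bodies, matching the metadata-only option described above.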
Step 3: Make LLM API Calls
Use the LLM provider SDK normally. The patched methods intercept each call, create a Weave trace record, execute the original API call, and log the result. Both synchronous and streaming calls are supported.
Key considerations:
- Standard (non-streaming) calls log the complete request and response as a single trace entry
- Streaming calls use an accumulator to collect chunks and log the final aggregated result
- Async variants are supported alongside their sync counterparts
- Tool-calling and function-calling patterns are captured with full argument details
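The streaming case can be sketched with a small accumulator. This is an illustrative toy, not Weave's implementation: chunks flow through to the caller unchanged, and one aggregated trace entry is emitted when the stream is exhausted.

```python
# Toy streaming accumulator: yield each chunk to the caller as it
# arrives, then log the fully accumulated result as a single entry.
def traced_stream(chunks, log):
    pieces = []
    for chunk in chunks:
        pieces.append(chunk)
        yield chunk                  # caller sees each chunk live
    log.append("".join(pieces))      # one trace entry for the whole response

log = []
stream = traced_stream(iter(["Hel", "lo ", "world"]), log)
consumed = "".join(stream)           # caller behavior is unchanged
print(consumed)                      # Hello world
print(log)                           # ['Hello world']
```

Real provider chunks are structured objects rather than strings, so the actual accumulators merge deltas field by field, but the wrap-yield-aggregate shape is the same.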
Step 4: Review Traced Calls
Examine the captured traces in the Weave UI. Each LLM call appears as a traced operation with the provider name, model identifier, input messages, output response, token usage breakdown, and latency. Calls made within traced functions appear as children in the call tree.
Key considerations:
- Token usage (prompt tokens, completion tokens, total tokens) is extracted from provider responses
- Custom display names show the model name for quick identification
- Streaming responses show the fully accumulated result, not individual chunks
- Integration traces nest seamlessly within @weave.op-decorated function call trees
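Token extraction can be sketched as a small normalizer. The field names below mirror OpenAI-style chat responses as an assumed example; other providers use different schemas, which the real integrations map individually:

```python
# Hedged sketch: normalize a provider-style usage block into the
# prompt/completion/total breakdown shown in the trace UI.
def extract_usage(response: dict) -> dict:
    usage = response.get("usage", {})
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        # fall back to summing if the provider omits the total
        "total_tokens": usage.get("total_tokens", prompt + completion),
    }

resp = {
    "model": "gpt-4o",
    "usage": {"prompt_tokens": 12, "completion_tokens": 30,
              "total_tokens": 42},
}
print(extract_usage(resp))
# {'prompt_tokens': 12, 'completion_tokens': 30, 'total_tokens': 42}
```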
Step 5: Undo Patching
Optionally remove the integration patches when they are no longer needed. Each patcher provides an undo_patch() method that restores the original SDK methods. This is primarily used in testing contexts.
Key considerations:
- In production, patching typically remains active for the lifetime of the process
- In tests, use fixture-based setup and teardown for clean isolation
- Undoing patches restores the original SDK behavior completely
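The patch/undo lifecycle can be sketched with a minimal patcher. The class and method names here (`Patcher`, `attempt_patch`, `FakeClient`) are illustrative stand-ins chosen for this sketch, not Weave's actual classes:

```python
# Minimal patcher sketch: remember the original attribute, replace it
# with a wrapped version, and restore it on undo.
class Patcher:
    def __init__(self, owner, attr, wrap):
        self.owner, self.attr, self.wrap = owner, attr, wrap
        self.original = None

    def attempt_patch(self):
        self.original = getattr(self.owner, self.attr)
        setattr(self.owner, self.attr, self.wrap(self.original))

    def undo_patch(self):
        if self.original is not None:
            setattr(self.owner, self.attr, self.original)  # full restore
            self.original = None

class FakeClient:
    """Stand-in for a provider SDK client."""
    def complete(self, prompt):
        return prompt.upper()

calls = []
def record(fn):
    def wrapper(self, prompt):
        calls.append(prompt)
        return fn(self, prompt)
    return wrapper

patcher = Patcher(FakeClient, "complete", record)
patcher.attempt_patch()
FakeClient().complete("hi")    # traced
patcher.undo_patch()
FakeClient().complete("bye")   # original behavior, nothing recorded
print(calls)                   # ['hi']
```

In a test suite, `attempt_patch()` and `undo_patch()` map naturally onto fixture setup and teardown, giving each test a cleanly patched or unpatched SDK.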