Implementation:Langfuse Langfuse ExtractVariablesFromTracingData
| Knowledge Sources | |
|---|---|
| Domains | LLM Evaluation, Data Extraction |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Concrete tool for extracting template variable values from trace, observation, and dataset item data provided by Langfuse.
Description
The extractVariablesFromTracingData function resolves evaluation template variable placeholders into concrete string values by querying trace and observation data from ClickHouse and dataset item data from PostgreSQL. It keeps per-invocation caches for traces and observations to avoid redundant database lookups when multiple variables map to the same data source.
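The per-invocation caching idea can be sketched as follows. This is a minimal illustration, not the code in evalService.ts: getOrFetch, TraceLike, and fakeFetchTrace are hypothetical names, and storing the in-flight promise (rather than the resolved value) is an assumption made here for simplicity.

```typescript
// Hypothetical helper, not part of the Langfuse codebase.
// It stores the in-flight promise so that repeated lookups for the same key
// trigger at most one fetch per cache instance.
function getOrFetch<T>(
  cache: Map<string, Promise<T>>,
  key: string,
  fetcher: () => Promise<T>,
): Promise<T> {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const pending = fetcher();
  cache.set(key, pending);
  return pending;
}

// Illustrative record shape; a real trace record has many more fields.
type TraceLike = { id: string; input: unknown; output: unknown };

let fetchCount = 0;

// Stand-in for a ClickHouse lookup such as getTraceById.
async function fakeFetchTrace(traceId: string): Promise<TraceLike> {
  fetchCount += 1;
  return { id: traceId, input: "question", output: "answer" };
}

// One cache per invocation: created inside the call, discarded afterwards.
const traceCache = new Map<string, Promise<TraceLike>>();

// Two variables mapped to the same trace share a single fetch.
const first = getOrFetch(traceCache, "trace-456", () => fakeFetchTrace("trace-456"));
const second = getOrFetch(traceCache, "trace-456", () => fakeFetchTrace("trace-456"));
```

Because the cache lives only for one invocation, it cannot serve stale data to a later evaluation run.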
The function supports three categories of data sources:
- Trace columns -- Direct columns on the trace record (input, output, metadata, name, userId, sessionId, tags, etc.) fetched from ClickHouse via the getTraceById utility.
- Observation columns -- Columns on observations within the trace (input, output, metadata, model, modelParameters, etc.) fetched from ClickHouse via getObservationForTraceIdByName, which looks up the observation by its name within the trace.
- Dataset item columns -- Columns on dataset items (input, expectedOutput, metadata) fetched from PostgreSQL using Kysely with optional version pinning via the valid_from timestamp.
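The version pinning used for dataset items can be illustrated in memory. The sketch below is not the Kysely query the function runs; DatasetItemRow and selectDatasetItemVersion are hypothetical names, and the rows stand in for PostgreSQL records. It assumes a pinned lookup matches valid_from exactly and that the current version is the row whose valid_to is null.

```typescript
// Hypothetical row shape standing in for a versioned dataset item record.
type DatasetItemRow = {
  id: string;
  input: unknown;
  expectedOutput: unknown;
  metadata: unknown;
  validFrom: Date;
  validTo: Date | null;
};

// Pick one version of a dataset item:
// - with validFrom: the version whose valid_from matches exactly
// - without validFrom: the current version (valid_to IS NULL)
function selectDatasetItemVersion(
  rows: DatasetItemRow[],
  validFrom?: Date,
): DatasetItemRow | undefined {
  if (validFrom !== undefined) {
    return rows.find((row) => row.validFrom.getTime() === validFrom.getTime());
  }
  return rows.find((row) => row.validTo === null);
}

// Two versions of the same (made-up) dataset item.
const rows: DatasetItemRow[] = [
  {
    id: "item-789",
    input: "capital of France?",
    expectedOutput: "old answer",
    metadata: null,
    validFrom: new Date("2024-01-01T00:00:00Z"),
    validTo: new Date("2024-02-01T00:00:00Z"),
  },
  {
    id: "item-789",
    input: "capital of France?",
    expectedOutput: "new answer",
    metadata: null,
    validFrom: new Date("2024-02-01T00:00:00Z"),
    validTo: null,
  },
];

const current = selectDatasetItemVersion(rows);
const pinned = selectDatasetItemVersion(rows, new Date("2024-01-01T00:00:00Z"));
const missing = selectDatasetItemVersion(rows, new Date("2023-01-01T00:00:00Z"));
```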
For each variable, the function finds the corresponding mapping, fetches the data source, extracts the specific column, optionally applies a JSONPath selector for nested JSON data, and converts the result to a string. The function also propagates the environment field from trace/observation data for downstream score creation.
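The per-variable steps above can be sketched roughly as below. Everything in this block is illustrative: Mapping is a reduced version of the real mapping object, selectPath is a simplified dotted-path lookup standing in for a real JSONPath evaluator, and the stringification rule in toStringValue (strings as-is, other values via JSON.stringify, missing values as an empty string) is an assumption rather than the confirmed behavior of the function.

```typescript
// Reduced, hypothetical mapping shape (the real one also carries
// langfuseObject and objectName).
type Mapping = {
  templateVariable: string;
  selectedColumnId: string;
  jsonSelector: string | null;
};

// Simplified stand-in for JSONPath: supports only "$.a.b.c" paths on plain objects.
function selectPath(value: unknown, selector: string): unknown {
  const keys = selector
    .replace(/^\$\.?/, "")
    .split(".")
    .filter((key) => key.length > 0);
  let current: unknown = value;
  for (const key of keys) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Assumed conversion rule: missing values become "", strings pass through,
// everything else is serialized as JSON.
function toStringValue(value: unknown): string {
  if (value === null || value === undefined) return "";
  return typeof value === "string" ? value : JSON.stringify(value);
}

// Find the mapping, read the column, apply the optional selector, stringify.
function resolveVariable(
  variable: string,
  mappings: Mapping[],
  record: Record<string, unknown>,
): { var: string; value: string } {
  const mapping = mappings.find((m) => m.templateVariable === variable);
  if (mapping === undefined) return { var: variable, value: "" };
  const column = record[mapping.selectedColumnId];
  const selected =
    mapping.jsonSelector !== null ? selectPath(column, mapping.jsonSelector) : column;
  return { var: variable, value: toStringValue(selected) };
}

// Made-up trace record and mappings for illustration.
const traceRecord: Record<string, unknown> = {
  input: { question: { text: "What is the capital of France?" } },
  output: "Paris",
};
const mappings: Mapping[] = [
  { templateVariable: "question", selectedColumnId: "input", jsonSelector: "$.question.text" },
  { templateVariable: "answer", selectedColumnId: "output", jsonSelector: null },
];

const resolved = ["question", "answer", "unmapped"].map((name) =>
  resolveVariable(name, mappings, traceRecord),
);
```

A variable without a mapping resolves to an empty string here, mirroring the documented behavior that missing mappings or data produce empty values.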
Usage
This function is called by the evaluate function during trace-level evaluation execution, after the job execution, configuration, and template have been fetched and validated. It is also used indirectly by observation-level evaluation through a similar extraction mechanism.
Code Reference
Source Location
- Repository: langfuse
- File: worker/src/features/evaluation/evalService.ts
- Lines: 1022-1230
Signature
```typescript
export async function extractVariablesFromTracingData({
  projectId,
  variables,
  traceId,
  variableMapping,
  traceTimestamp,
  datasetItemId,
  datasetItemValidFrom,
}: {
  projectId: string;
  variables: string[];
  traceId: string;
  variableMapping: z.infer<typeof variableMappingList>;
  traceTimestamp?: Date;
  datasetItemId?: string;
  datasetItemValidFrom?: Date;
}): Promise<{ var: string; value: string; environment?: string }[]>
```
Import
```typescript
import { extractVariablesFromTracingData } from "../features/evaluation/evalService";
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| projectId | string | Yes | The project ID for scoping all database queries. |
| variables | string[] | Yes | List of template variable names to resolve (e.g., ["input", "output", "expected"]). |
| traceId | string | Yes | The trace ID to extract data from. Used for both trace and observation lookups. |
| variableMapping | z.infer<typeof variableMappingList> | Yes | Array of mapping objects, each specifying templateVariable, langfuseObject ("trace", "generation", "span", "dataset_item"), selectedColumnId, objectName (the observation name, for observation mappings), and jsonSelector (optional JSONPath). |
| traceTimestamp | Date | No | The trace's timestamp, used to optimize ClickHouse lookups by narrowing the time range. |
| datasetItemId | string | No | The dataset item ID to fetch when a variable maps to "dataset_item". |
| datasetItemValidFrom | Date | No | The specific version timestamp for the dataset item. When provided, fetches that exact version; otherwise fetches the current (valid_to IS NULL) version. |
Outputs
| Name | Type | Description |
|---|---|---|
| results | Array<{ var: string; value: string; environment?: string }> | Array of resolved variables. Each entry contains the variable name, its string value, and optionally the environment from the source trace/observation. Missing mappings or data produce empty string values. |
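To show how a caller might consume this output, here is a small sketch that substitutes the resolved values into a prompt template and picks an environment for a downstream score. ExtractedVariable mirrors the documented return type; fillTemplate, firstEnvironment, and the {{name}} placeholder syntax are assumptions made for illustration and may differ from how the evaluate function actually compiles its template.

```typescript
// Mirrors the documented element type of the returned array.
type ExtractedVariable = { var: string; value: string; environment?: string };

// Hypothetical helper: replace {{name}} placeholders with resolved values.
// Unknown placeholders become empty strings.
function fillTemplate(template: string, extracted: ExtractedVariable[]): string {
  const values = new Map(extracted.map((e) => [e.var, e.value] as const));
  return template.replace(
    /\{\{\s*(\w+)\s*\}\}/g,
    (_match, name: string) => values.get(name) ?? "",
  );
}

// Hypothetical helper: take the first environment reported by any variable.
function firstEnvironment(extracted: ExtractedVariable[]): string | undefined {
  return extracted.find((e) => e.environment !== undefined)?.environment;
}

// Sample data shaped like the function's output.
const extracted: ExtractedVariable[] = [
  { var: "expected", value: "Paris is the capital of France." },
  {
    var: "actual_output",
    value: "The capital of France is Paris.",
    environment: "production",
  },
];

const prompt = fillTemplate(
  "Expected: {{expected}}\nActual: {{actual_output}}",
  extracted,
);
const environment = firstEnvironment(extracted);
```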
Usage Examples
Basic Variable Extraction from Trace
```typescript
import { extractVariablesFromTracingData } from "../features/evaluation/evalService";

const variables = await extractVariablesFromTracingData({
  projectId: "proj-123",
  variables: ["input", "output"],
  traceId: "trace-456",
  variableMapping: [
    {
      templateVariable: "input",
      langfuseObject: "trace",
      selectedColumnId: "input",
      objectName: null,
      jsonSelector: null,
    },
    {
      templateVariable: "output",
      langfuseObject: "trace",
      selectedColumnId: "output",
      objectName: null,
      jsonSelector: null,
    },
  ],
  traceTimestamp: new Date("2024-01-15T10:00:00Z"),
});

// Result:
// [
//   { var: "input", value: "What is the capital of France?", environment: "production" },
//   { var: "output", value: "The capital of France is Paris.", environment: "production" }
// ]
```
Extraction with JSONPath Selector
```typescript
const variables = await extractVariablesFromTracingData({
  projectId: "proj-123",
  variables: ["user_message"],
  traceId: "trace-456",
  variableMapping: [
    {
      templateVariable: "user_message",
      langfuseObject: "generation",
      selectedColumnId: "input",
      objectName: "chat-completion", // name of the observation
      jsonSelector: "$.messages[-1].content", // JSONPath into the input JSON
    },
  ],
});

// Result:
// [{ var: "user_message", value: "Tell me about France" }]
```
Extraction from Dataset Item
```typescript
const variables = await extractVariablesFromTracingData({
  projectId: "proj-123",
  variables: ["expected", "actual_output"],
  traceId: "trace-456",
  datasetItemId: "item-789",
  variableMapping: [
    {
      templateVariable: "expected",
      langfuseObject: "dataset_item",
      selectedColumnId: "expected_output",
      objectName: null,
      jsonSelector: null,
    },
    {
      templateVariable: "actual_output",
      langfuseObject: "trace",
      selectedColumnId: "output",
      objectName: null,
      jsonSelector: null,
    },
  ],
});

// Result:
// [
//   { var: "expected", value: "Paris is the capital of France." },
//   { var: "actual_output", value: "The capital of France is Paris.", environment: "production" }
// ]
```