Implementation:Langfuse Langfuse BuildScoreEvent
| Knowledge Sources | |
|---|---|
| Domains | LLM Evaluation, Data Validation |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
A concrete tool, provided by Langfuse, for validating LLM judge responses and constructing score-creation events.
Description
This implementation consists of two primary functions and several supporting utilities in the evalExecutionUtils module:
- buildScoreEvent constructs a ScoreEventType object from validated evaluation results. It formats the score data into Langfuse's standard event ingestion format with source set to "EVAL" and data type "NUMERIC", and includes full execution metadata for traceability.
- validateLLMResponse validates the raw LLM response against a Zod v3 schema built from the evaluation template's output schema. It uses safeParse for non-throwing validation and returns a discriminated union indicating success (with parsed data) or failure (with an error message).
- buildEvalScoreSchema constructs the Zod v3 validation schema from the eval template's output schema. The template's score and reasoning descriptions become Zod .describe() annotations, which both guide the LLM's structured output and serve as documentation.
- buildEvalMessages wraps the compiled prompt string in a single-element ChatMessage array with the User role, ready for the LLM call.
- buildExecutionMetadata constructs execution metadata by filtering out null entries from the job execution identifiers, producing a clean Record<string, string> for inclusion in score events.
- getEnvironmentFromVariables returns the environment string of the first extracted variable that has one set, providing the deployment context for the score.
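The two smallest helpers are easiest to follow as code. The block below is a minimal sketch under stated assumptions, not the Langfuse source: the `Sketch`-suffixed names and the `ExtractedVariableSketch` shape are hypothetical, and the `target_observation_id` and `target_dataset_item_id` keys are guessed by analogy (only `job_execution_id`, `job_configuration_id`, and `target_trace_id` appear in this page's example output).

```typescript
// Hypothetical shape; the real ExtractedVariable type may carry other fields.
interface ExtractedVariableSketch {
  var: string;
  value: string;
  environment?: string;
}

interface ExecutionIdsSketch {
  jobExecutionId: string;
  jobConfigurationId: string;
  targetTraceId?: string | null;
  targetObservationId?: string | null;
  targetDatasetItemId?: string | null;
}

// Drop null/undefined identifiers so the result is a clean Record<string, string>.
function buildExecutionMetadataSketch(
  ids: ExecutionIdsSketch,
): Record<string, string> {
  const candidates: Array<[string, string | null | undefined]> = [
    ["job_execution_id", ids.jobExecutionId],
    ["job_configuration_id", ids.jobConfigurationId],
    ["target_trace_id", ids.targetTraceId],
    ["target_observation_id", ids.targetObservationId], // assumed key name
    ["target_dataset_item_id", ids.targetDatasetItemId], // assumed key name
  ];
  const metadata: Record<string, string> = {};
  for (const [key, value] of candidates) {
    if (value != null) metadata[key] = value;
  }
  return metadata;
}

// Return the environment of the first variable that has one set.
// Assumption: an empty string counts as "not set".
function getEnvironmentSketch(
  variables: ExtractedVariableSketch[],
): string | undefined {
  return variables.find((v) => v.environment)?.environment;
}
```

Because neither function touches external state, each can be exercised with plain object literals and no mocks.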
Usage
These utilities are called by the executeLLMAsJudgeEvaluation function in the evaluation service. All of them are pure functions except buildScoreEvent, which reads the current time to set the event timestamp, so they are straightforward to unit test.
Code Reference
Source Location
- Repository: langfuse
- File: worker/src/features/evaluation/evalExecutionUtils.ts
- Lines: 120-139 (buildScoreEvent), 168-180 (validateLLMResponse), 51-56 (buildEvalScoreSchema), 88-96 (buildEvalMessages), 64-80 (buildExecutionMetadata), 148-152 (getEnvironmentFromVariables)
Signature
// Build score event
export function buildScoreEvent(params: BuildScoreEventParams): ScoreEventType;
// Validate LLM response
export function validateLLMResponse(
  params: ValidateLLMResponseParams,
):
  | { success: true; data: { score: number; reasoning: string } }
  | { success: false; error: string };
// Build Zod v3 score schema
export function buildEvalScoreSchema(
  outputSchema: EvalTemplateOutputSchema,
): z.ZodObject<{ reasoning: z.ZodString; score: z.ZodNumber }>;
// Build eval messages
export function buildEvalMessages(
  prompt: string,
): [{ type: ChatMessageType.User; role: ChatMessageRole.User; content: string }];
// Build execution metadata
export function buildExecutionMetadata(params: {
  jobExecutionId: string;
  jobConfigurationId: string;
  targetTraceId?: string | null;
  targetObservationId?: string | null;
  targetDatasetItemId?: string | null;
}): Record<string, string>;
// Get environment from variables
export function getEnvironmentFromVariables(
  variables: ExtractedVariable[],
): string | undefined;
Import
import {
  buildScoreEvent,
  validateLLMResponse,
  buildEvalScoreSchema,
  buildEvalMessages,
  buildExecutionMetadata,
  getEnvironmentFromVariables,
  evalTemplateOutputSchema,
} from "./evalExecutionUtils";
I/O Contract
Inputs
BuildScoreEventParams:
| Name | Type | Required | Description |
|---|---|---|---|
| eventId | string | Yes | Unique event identifier (UUID). |
| scoreId | string | Yes | Unique score identifier (UUID). |
| traceId | string or null | Yes | The trace ID this score is attached to. |
| observationId | string or null | Yes | The observation ID this score is attached to (null for trace-level scores). |
| scoreName | string | Yes | The name of the score (from job configuration). |
| value | number | Yes | The numeric score value from the LLM judge. |
| reasoning | string | Yes | The textual reasoning from the LLM judge. |
| environment | string | Yes | The deployment environment (e.g., "production"). |
| executionTraceId | string | Yes | The W3C trace ID of the evaluation execution's internal trace. |
| metadata | Record<string, string> | Yes | Execution metadata including job_execution_id, job_configuration_id, etc. |
ValidateLLMResponseParams:
| Name | Type | Required | Description |
|---|---|---|---|
| response | unknown | Yes | The raw LLM response object from fetchLLMCompletion. |
| schema | ReturnType<typeof buildEvalScoreSchema> | Yes | The Zod v3 schema to validate against. |
Outputs
buildScoreEvent:
| Name | Type | Description |
|---|---|---|
| event | ScoreEventType | A score-creation event with type "score-create", a timestamp set to the current time, and a body containing source "EVAL", dataType "NUMERIC", and the provided input fields. |
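The mapping from input parameters to event fields can be sketched as follows. This is an illustration reconstructed from the example output shown on this page, not the actual implementation: `buildScoreEventSketch` and `ScoreEventParamsSketch` are hypothetical names, and the real ScoreEventType may include fields not shown here.

```typescript
interface ScoreEventParamsSketch {
  eventId: string;
  scoreId: string;
  traceId: string | null;
  observationId: string | null;
  scoreName: string;
  value: number;
  reasoning: string;
  environment: string;
  executionTraceId: string;
  metadata: Record<string, string>;
}

function buildScoreEventSketch(params: ScoreEventParamsSketch) {
  return {
    id: params.eventId,
    timestamp: new Date().toISOString(), // the only non-deterministic field
    type: "score-create" as const,
    body: {
      id: params.scoreId,
      traceId: params.traceId,
      observationId: params.observationId,
      name: params.scoreName,
      value: params.value,
      comment: params.reasoning, // the judge's reasoning becomes the score comment
      source: "EVAL" as const,
      environment: params.environment,
      executionTraceId: params.executionTraceId,
      metadata: params.metadata,
      dataType: "NUMERIC" as const,
    },
  };
}
```

In a unit test, one reasonable approach is to compare every field except `timestamp` for equality and only check that `timestamp` is a valid ISO 8601 string.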
validateLLMResponse:
| Name | Type | Description |
|---|---|---|
| success result | { success: true; data: { score: number; reasoning: string } } | When validation passes: the parsed score and reasoning. |
| failure result | { success: false; error: string } | When validation fails: the Zod error message describing what went wrong. |
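Because the result is a discriminated union, callers can branch on `success` and let TypeScript narrow the remaining fields. The sketch below swaps the Zod `safeParse` call for hand-written `typeof` checks purely to stay dependency-free; `validateSketch`, `describeResult`, and the error strings are hypothetical and do not reproduce Zod's messages.

```typescript
type ValidationResult =
  | { success: true; data: { score: number; reasoning: string } }
  | { success: false; error: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Stand-in for schema.safeParse(response): never throws, always returns a result.
function validateSketch(response: unknown): ValidationResult {
  if (!isRecord(response)) {
    return { success: false, error: "response is not an object" };
  }
  const { score, reasoning } = response;
  if (typeof score !== "number" || Number.isNaN(score)) {
    return { success: false, error: "score must be a number" };
  }
  if (typeof reasoning !== "string") {
    return { success: false, error: "reasoning must be a string" };
  }
  return { success: true, data: { score, reasoning } };
}

// Branching on `success` narrows the union in each arm:
// `data` exists only on success, `error` only on failure.
function describeResult(result: ValidationResult): string {
  return result.success
    ? `score=${result.data.score}`
    : `invalid: ${result.error}`;
}
```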
Usage Examples
Validating and Building a Score Event
import {
  buildEvalScoreSchema,
  validateLLMResponse,
  buildScoreEvent,
  buildExecutionMetadata,
  evalTemplateOutputSchema,
} from "./evalExecutionUtils";
// 1. Parse the template's output schema
const parsedSchema = evalTemplateOutputSchema.parse({
  score: "Relevance score from 0 to 10",
  reasoning: "Explanation of the relevance score",
});
// 2. Build the Zod v3 validation schema
const evalScoreSchema = buildEvalScoreSchema(parsedSchema);
// 3. Validate the LLM response
const llmOutput = { score: 8, reasoning: "The response is highly relevant..." };
const validated = validateLLMResponse({
  response: llmOutput,
  schema: evalScoreSchema,
});
if (!validated.success) {
  throw new Error(`Invalid LLM response: ${validated.error}`);
}
// 4. Build execution metadata (null identifiers are dropped)
const metadata = buildExecutionMetadata({
  jobExecutionId: "exec-456",
  jobConfigurationId: "config-123",
  targetTraceId: "trace-789",
  targetObservationId: null,
  targetDatasetItemId: null,
});
// 5. Build the score event
const scoreEvent = buildScoreEvent({
  eventId: "evt-001",
  scoreId: "score-002",
  traceId: "trace-789",
  observationId: null,
  scoreName: "relevance",
  value: validated.data.score,
  reasoning: validated.data.reasoning,
  environment: "production",
  executionTraceId: "eval-trace-456",
  metadata,
});
// scoreEvent:
// {
//   id: "evt-001",
//   timestamp: "2024-01-15T10:30:00.000Z", // illustrative; the actual value is the time of the call
//   type: "score-create",
//   body: {
//     id: "score-002",
//     traceId: "trace-789",
//     observationId: null,
//     name: "relevance",
//     value: 8,
//     comment: "The response is highly relevant...",
//     source: "EVAL",
//     environment: "production",
//     executionTraceId: "eval-trace-456",
//     metadata: { job_execution_id: "exec-456", job_configuration_id: "config-123", target_trace_id: "trace-789" },
//     dataType: "NUMERIC",
//   }
// }
Handling Validation Failure
// evalScoreSchema is the schema built with buildEvalScoreSchema in the previous example
const badOutput = { score: "not a number", reasoning: 42 };
const result = validateLLMResponse({
  response: badOutput,
  schema: evalScoreSchema,
});
if (!result.success) {
  // result.error contains the Zod validation error message
  console.error("Validation failed:", result.error);
  // Typically thrown as UnrecoverableError to prevent retry
}
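The closing comment mentions raising an UnrecoverableError so the job is not retried. A minimal sketch of that pattern follows. `UnrecoverableErrorSketch` is a local stand-in (an assumption) for whatever non-retryable error class the worker's queue library provides, and `unwrapOrFail` is a hypothetical helper name.

```typescript
// Local stand-in for a queue library's non-retryable error class (assumption).
class UnrecoverableErrorSketch extends Error {
  constructor(message: string) {
    super(message);
    this.name = "UnrecoverableErrorSketch";
  }
}

type ValidationOutcome =
  | { success: true; data: { score: number; reasoning: string } }
  | { success: false; error: string };

// Return the parsed data, or throw a dedicated error type on failure.
function unwrapOrFail(
  result: ValidationOutcome,
): { score: number; reasoning: string } {
  if (!result.success) {
    throw new UnrecoverableErrorSketch(`Invalid LLM response: ${result.error}`);
  }
  return result.data;
}
```

Using a dedicated error type lets the caller's retry logic distinguish validation failures from transient ones such as network errors.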