Implementation:Langfuse Langfuse BuildScoreEvent

Knowledge Sources	Langfuse
Domains	LLM Evaluation, Data Validation
Last Updated	2026-02-14 00:00 GMT

Overview

Concrete tool, provided by Langfuse, for validating LLM judge responses and constructing score creation events.

Description

This implementation consists of two primary functions and several supporting utilities in the evalExecutionUtils module:

buildScoreEvent constructs a ScoreEventType object from validated evaluation results. It formats the score data into Langfuse's standard event ingestion format with source set to "EVAL", data type "NUMERIC", and includes full execution metadata for traceability.
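The mapping described above can be sketched as follows. This is a minimal illustration, not the Langfuse source: the type BuildScoreEventParamsSketch, the function name buildScoreEventSketch, and the optional now parameter are hypothetical stand-ins, and the field layout is inferred from the example output shown later on this page.

```typescript
// Local stand-in for BuildScoreEventParams; the real type lives in the
// Langfuse codebase and may differ in detail.
type BuildScoreEventParamsSketch = {
  eventId: string;
  scoreId: string;
  traceId: string | null;
  observationId: string | null;
  scoreName: string;
  value: number;
  reasoning: string;
  environment: string;
  executionTraceId: string;
  metadata: Record<string, string>;
};

// `now` is a hypothetical parameter added here so the timestamp can be
// pinned in tests; the documented function takes only `params`.
function buildScoreEventSketch(
  params: BuildScoreEventParamsSketch,
  now: Date = new Date(),
) {
  return {
    id: params.eventId,
    timestamp: now.toISOString(),
    type: "score-create" as const,
    body: {
      id: params.scoreId,
      traceId: params.traceId,
      observationId: params.observationId,
      name: params.scoreName,
      value: params.value,
      comment: params.reasoning, // the judge's reasoning becomes the score comment
      source: "EVAL" as const,
      environment: params.environment,
      executionTraceId: params.executionTraceId,
      metadata: params.metadata,
      dataType: "NUMERIC" as const,
    },
  };
}
```

Passing the clock in as an argument is one way to keep such a function deterministic under test; the page only states that the real function uses the current timestamp.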

validateLLMResponse validates the raw LLM response against a Zod v3 schema built from the evaluation template's output schema. It uses safeParse for non-throwing validation and returns a discriminated union indicating success (with parsed data) or failure (with error message).
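The non-throwing pattern described above can be sketched without depending on Zod itself. In this illustration, SafeParser is a hypothetical structural interface covering only the safeParse behavior the sketch relies on, handRolledSchema is a hand-written stand-in for a real Zod schema, and validateLLMResponseSketch is an assumed approximation of the documented function rather than its actual body.

```typescript
// Structural stand-in for the part of a Zod schema used here (assumption).
type SafeParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: { message: string } };

interface SafeParser<T> {
  safeParse(input: unknown): SafeParseResult<T>;
}

type EvalScore = { score: number; reasoning: string };

type ValidationResult =
  | { success: true; data: EvalScore }
  | { success: false; error: string };

// Never throws: a failed parse is converted into a failure variant.
function validateLLMResponseSketch(params: {
  response: unknown;
  schema: SafeParser<EvalScore>;
}): ValidationResult {
  const parsed = params.schema.safeParse(params.response);
  if (parsed.success === false) {
    return { success: false, error: parsed.error.message };
  }
  return { success: true, data: parsed.data };
}

// Hand-written schema used only to exercise the sketch without Zod.
const handRolledSchema: SafeParser<EvalScore> = {
  safeParse(input: unknown): SafeParseResult<EvalScore> {
    const c = input as { score?: unknown; reasoning?: unknown } | null;
    if (
      typeof c === "object" &&
      c !== null &&
      typeof c.score === "number" &&
      typeof c.reasoning === "string"
    ) {
      return { success: true, data: { score: c.score, reasoning: c.reasoning } };
    }
    return {
      success: false,
      error: { message: "expected { score: number, reasoning: string }" },
    };
  },
};
```

Because the result is a discriminated union on success, callers must check the flag before TypeScript allows access to data or error, which keeps the failure path explicit.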

buildEvalScoreSchema constructs the Zod v3 validation schema from the eval template's output schema, using the template's score and reasoning descriptions as Zod .describe() annotations that both guide LLM structured output and serve as documentation.

buildEvalMessages wraps the compiled prompt string into a ChatMessage array with User role, ready for the LLM call.
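As a rough sketch of that wrapping step: the constants ChatMessageTypeSketch and ChatMessageRoleSketch below are local stand-ins for Langfuse's ChatMessageType and ChatMessageRole, and their string values are assumptions made for illustration only.

```typescript
// Local stand-ins for the Langfuse enums; the "user" values are assumed.
const ChatMessageTypeSketch = { User: "user" } as const;
const ChatMessageRoleSketch = { User: "user" } as const;

type UserMessageSketch = {
  type: typeof ChatMessageTypeSketch.User;
  role: typeof ChatMessageRoleSketch.User;
  content: string;
};

// Wraps a compiled prompt into a single-element tuple holding one user message.
function buildEvalMessagesSketch(prompt: string): [UserMessageSketch] {
  return [
    {
      type: ChatMessageTypeSketch.User,
      role: ChatMessageRoleSketch.User,
      content: prompt,
    },
  ];
}
```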

buildExecutionMetadata constructs execution metadata by filtering out null entries from job execution identifiers, producing a clean Record<string, string> for inclusion in score events.
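The filtering step might look like the sketch below. The keys job_execution_id, job_configuration_id, and target_trace_id appear in the example output on this page; target_observation_id and target_dataset_item_id are assumed by analogy, and treating undefined the same as null is also an assumption.

```typescript
function buildExecutionMetadataSketch(params: {
  jobExecutionId: string;
  jobConfigurationId: string;
  targetTraceId?: string | null;
  targetObservationId?: string | null;
  targetDatasetItemId?: string | null;
}): Record<string, string> {
  const entries: Array<[string, string | null | undefined]> = [
    ["job_execution_id", params.jobExecutionId],
    ["job_configuration_id", params.jobConfigurationId],
    ["target_trace_id", params.targetTraceId],
    ["target_observation_id", params.targetObservationId], // key name assumed
    ["target_dataset_item_id", params.targetDatasetItemId], // key name assumed
  ];

  const metadata: Record<string, string> = {};
  for (const [key, value] of entries) {
    // `!= null` skips both null and undefined, so only real strings remain.
    if (value != null) {
      metadata[key] = value;
    }
  }
  return metadata;
}
```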

getEnvironmentFromVariables extracts the environment string from the first extracted variable that has one set, providing the deployment context for the score.
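A minimal sketch of that lookup follows. The ExtractedVariableSketch shape is an assumption; the real ExtractedVariable type is defined in the Langfuse codebase and may carry different fields.

```typescript
// Assumed shape of an extracted variable, for illustration only.
type ExtractedVariableSketch = {
  var: string;
  value: unknown;
  environment?: string;
};

// Returns the environment of the first variable that has a non-empty one,
// or undefined when no variable carries an environment.
function getEnvironmentFromVariablesSketch(
  variables: ExtractedVariableSketch[],
): string | undefined {
  return variables.find((v) => Boolean(v.environment))?.environment;
}
```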

Usage

These utilities are called by the executeLLMAsJudgeEvaluation function in the evaluation service. Apart from buildScoreEvent, which reads the current time for the event timestamp, they are pure functions, which makes them easy to unit test.

Code Reference

Source Location

Repository: langfuse
File: worker/src/features/evaluation/evalExecutionUtils.ts
Lines: 120-139 (buildScoreEvent), 168-180 (validateLLMResponse), 51-56 (buildEvalScoreSchema), 88-96 (buildEvalMessages), 64-80 (buildExecutionMetadata), 148-152 (getEnvironmentFromVariables)

Signature

// Build score event
export function buildScoreEvent(params: BuildScoreEventParams): ScoreEventType;

// Validate LLM response
export function validateLLMResponse(
  params: ValidateLLMResponseParams,
):
  | { success: true; data: { score: number; reasoning: string } }
  | { success: false; error: string };

// Build Zod v3 score schema
export function buildEvalScoreSchema(
  outputSchema: EvalTemplateOutputSchema,
): z.ZodObject<{ reasoning: z.ZodString; score: z.ZodNumber }>;

// Build eval messages
export function buildEvalMessages(
  prompt: string,
): [{ type: ChatMessageType.User; role: ChatMessageRole.User; content: string }];

// Build execution metadata
export function buildExecutionMetadata(params: {
  jobExecutionId: string;
  jobConfigurationId: string;
  targetTraceId?: string | null;
  targetObservationId?: string | null;
  targetDatasetItemId?: string | null;
}): Record<string, string>;

// Get environment from variables
export function getEnvironmentFromVariables(
  variables: ExtractedVariable[],
): string | undefined;

Import

import {
  buildScoreEvent,
  validateLLMResponse,
  buildEvalScoreSchema,
  buildEvalMessages,
  buildExecutionMetadata,
  getEnvironmentFromVariables,
  evalTemplateOutputSchema,
} from "./evalExecutionUtils";

I/O Contract

Inputs

BuildScoreEventParams:

Name	Type	Required	Description
eventId	string	Yes	Unique event identifier (UUID).
scoreId	string	Yes	Unique score identifier (UUID).
traceId	string or null	Yes	The trace ID this score is attached to.
observationId	string or null	Yes	The observation ID this score is attached to (null for trace-level scores).
scoreName	string	Yes	The name of the score (from job configuration).
value	number	Yes	The numeric score value from the LLM judge.
reasoning	string	Yes	The textual reasoning from the LLM judge.
environment	string	Yes	The deployment environment (e.g., "production").
executionTraceId	string	Yes	The W3C trace ID of the evaluation execution's internal trace.
metadata	Record<string, string>	Yes	Execution metadata including job_execution_id, job_configuration_id, etc.

ValidateLLMResponseParams:

Name	Type	Required	Description
response	unknown	Yes	The raw LLM response object from fetchLLMCompletion.
schema	ReturnType<typeof buildEvalScoreSchema>	Yes	The Zod v3 schema to validate against.

Outputs

buildScoreEvent:

Name	Type	Description
return value	ScoreEventType	A score creation event with type "score-create", source "EVAL", dataType "NUMERIC", the current timestamp, and all provided fields mapped to the event body.

validateLLMResponse:

Name	Type	Description
success result	{ success: true; data: { score: number; reasoning: string } }	When validation passes: the parsed score and reasoning.
failure result	{ success: false; error: string }	When validation fails: the Zod error message describing what went wrong.

Usage Examples

Validating and Building a Score Event

import {
  buildEvalScoreSchema,
  validateLLMResponse,
  buildScoreEvent,
  buildExecutionMetadata,
  evalTemplateOutputSchema,
} from "./evalExecutionUtils";

// 1. Parse the template's output schema
const parsedSchema = evalTemplateOutputSchema.parse({
  score: "Relevance score from 0 to 10",
  reasoning: "Explanation of the relevance score",
});

// 2. Build the Zod v3 validation schema
const evalScoreSchema = buildEvalScoreSchema(parsedSchema);

// 3. Validate the LLM response
const llmOutput = { score: 8, reasoning: "The response is highly relevant..." };
const validated = validateLLMResponse({
  response: llmOutput,
  schema: evalScoreSchema,
});

if (!validated.success) {
  throw new Error(`Invalid LLM response: ${validated.error}`);
}

// 4. Build execution metadata
const metadata = buildExecutionMetadata({
  jobExecutionId: "exec-456",
  jobConfigurationId: "config-123",
  targetTraceId: "trace-789",
  targetObservationId: null,
  targetDatasetItemId: null,
});

// 5. Build the score event
const scoreEvent = buildScoreEvent({
  eventId: "evt-001",
  scoreId: "score-002",
  traceId: "trace-789",
  observationId: null,
  scoreName: "relevance",
  value: validated.data.score,
  reasoning: validated.data.reasoning,
  environment: "production",
  executionTraceId: "eval-trace-456",
  metadata,
});

// scoreEvent:
// {
//   id: "evt-001",
//   timestamp: "2024-01-15T10:30:00.000Z", // example value; set to the current time at call
//   type: "score-create",
//   body: {
//     id: "score-002",
//     traceId: "trace-789",
//     observationId: null,
//     name: "relevance",
//     value: 8,
//     comment: "The response is highly relevant...",
//     source: "EVAL",
//     environment: "production",
//     executionTraceId: "eval-trace-456",
//     metadata: { job_execution_id: "exec-456", job_configuration_id: "config-123", target_trace_id: "trace-789" },
//     dataType: "NUMERIC",
//   }
// }

Handling Validation Failure

const badOutput = { score: "not a number", reasoning: 42 };
const result = validateLLMResponse({
  response: badOutput,
  schema: evalScoreSchema,
});

if (!result.success) {
  // result.error contains the Zod validation error message
  console.error("Validation failed:", result.error);
  // Typically thrown as UnrecoverableError to prevent retry
}

Related Pages

Implements Principle

Principle:Langfuse_Langfuse_Score_Validation_and_Creation
