

Implementation:Langfuse Langfuse BuildScoreEvent

From Leeroopedia
Knowledge Sources
Domains: LLM Evaluation, Data Validation
Last Updated: 2026-02-14 00:00 GMT

Overview

Concrete tool, provided by Langfuse, for validating LLM judge responses and constructing score creation events.

Description

This implementation consists of two primary functions and several supporting utilities in the evalExecutionUtils module:

buildScoreEvent constructs a ScoreEventType object from validated evaluation results. It formats the score data into Langfuse's standard event ingestion format, sets source to "EVAL" and dataType to "NUMERIC", and includes the full execution metadata for traceability.
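The mapping from parameters to event fields can be sketched as follows. This is a minimal illustration, not the Langfuse implementation: ScoreEventParamsSketch and buildScoreEventSketch are hypothetical stand-ins, and the field layout is taken from the example output shown later on this page.

```typescript
// Hypothetical stand-in for BuildScoreEventParams; the real type lives in
// the Langfuse codebase.
interface ScoreEventParamsSketch {
  eventId: string;
  scoreId: string;
  traceId: string | null;
  observationId: string | null;
  scoreName: string;
  value: number;
  reasoning: string;
  environment: string;
  executionTraceId: string;
  metadata: Record<string, string>;
}

// Sketch only: mirrors the event layout documented on this page.
function buildScoreEventSketch(p: ScoreEventParamsSketch) {
  return {
    id: p.eventId,
    // The one impure step: the event is stamped with the current time.
    timestamp: new Date().toISOString(),
    type: "score-create" as const,
    body: {
      id: p.scoreId,
      traceId: p.traceId,
      observationId: p.observationId,
      name: p.scoreName,
      value: p.value,
      comment: p.reasoning, // the judge's reasoning becomes the score comment
      source: "EVAL" as const,
      environment: p.environment,
      executionTraceId: p.executionTraceId,
      metadata: p.metadata,
      dataType: "NUMERIC" as const,
    },
  };
}

const sketchEvent = buildScoreEventSketch({
  eventId: "evt-001",
  scoreId: "score-002",
  traceId: "trace-789",
  observationId: null,
  scoreName: "relevance",
  value: 8,
  reasoning: "The response is highly relevant...",
  environment: "production",
  executionTraceId: "eval-trace-456",
  metadata: { job_execution_id: "exec-456" },
});
```

Everything except the timestamp is a direct rename of an input field, which is why only the timestamp needs special handling in tests.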

validateLLMResponse validates the raw LLM response against a Zod v3 schema built from the evaluation template's output schema. It uses safeParse for non-throwing validation and returns a discriminated union indicating success (with parsed data) or failure (with error message).
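The non-throwing pattern described above might look like the sketch below. To keep it runnable without Zod installed, it uses a small structural interface (SafeParseable, a hypothetical name) in place of a real Zod schema, and a hand-written stub schema; validateSketch is likewise a stand-in, not the Langfuse function.

```typescript
// Structural view of a schema with a non-throwing parse method. Zod's
// safeParse returns a compatible result shape (assumption: only
// error.message is read on failure).
type SafeParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: { message: string } };

interface SafeParseable<T> {
  safeParse(input: unknown): SafeParseResult<T>;
}

type JudgeOutput = { score: number; reasoning: string };

type ValidationResult =
  | { success: true; data: JudgeOutput }
  | { success: false; error: string };

// Sketch of the validation step: never throws, always returns a tagged result.
function validateSketch(params: {
  response: unknown;
  schema: SafeParseable<JudgeOutput>;
}): ValidationResult {
  const parsed = params.schema.safeParse(params.response);
  if (parsed.success) {
    return { success: true, data: parsed.data };
  }
  return { success: false, error: parsed.error.message };
}

// Hand-written stub schema, used here only so the sketch is self-contained.
const stubSchema: SafeParseable<JudgeOutput> = {
  safeParse(input: unknown): SafeParseResult<JudgeOutput> {
    const candidate = (typeof input === "object" && input !== null ? input : {}) as {
      score?: unknown;
      reasoning?: unknown;
    };
    const { score, reasoning } = candidate;
    if (typeof score === "number" && typeof reasoning === "string") {
      return { success: true, data: { score, reasoning } };
    }
    return {
      success: false,
      error: { message: "expected { score: number; reasoning: string }" },
    };
  },
};

const okResult = validateSketch({
  response: { score: 8, reasoning: "relevant" },
  schema: stubSchema,
});
const badResult = validateSketch({
  response: { score: "not a number", reasoning: 42 },
  schema: stubSchema,
});
```

Because the result is a discriminated union, callers must check success before touching data or error, and the compiler enforces that check.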

buildEvalScoreSchema constructs the Zod v3 validation schema from the eval template's output schema, using the template's score and reasoning descriptions as Zod .describe() annotations that both guide LLM structured output and serve as documentation.
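A schema of the documented shape could be assembled roughly as below. This is a sketch under assumptions: OutputSchemaSketch and buildSchemaSketch are hypothetical names, and the import path "zod" is a guess at how Zod is made available (the source only states that Zod v3 is used).

```typescript
import { z } from "zod";

// Hypothetical stand-in for EvalTemplateOutputSchema: two description strings.
type OutputSchemaSketch = { score: string; reasoning: string };

// Sketch: each template description is attached to its field via .describe().
function buildSchemaSketch(outputSchema: OutputSchemaSketch) {
  return z.object({
    reasoning: z.string().describe(outputSchema.reasoning),
    score: z.number().describe(outputSchema.score),
  });
}

const relevanceSchema = buildSchemaSketch({
  score: "Relevance score from 0 to 10",
  reasoning: "Explanation of the relevance score",
});

// safeParse reports failure through the result object instead of throwing.
const goodParse = relevanceSchema.safeParse({ score: 8, reasoning: "Highly relevant." });
const badParse = relevanceSchema.safeParse({ score: "eight", reasoning: "Highly relevant." });
```

The descriptions stay attached to the schema fields, so tooling that converts the schema into a structured-output request can pass them on to the model.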

buildEvalMessages wraps the compiled prompt string into a ChatMessage array with User role, ready for the LLM call.
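As a rough illustration, the wrapping step amounts to returning a one-element tuple. MessageTypeSketch and MessageRoleSketch below are local stand-ins for Langfuse's ChatMessageType and ChatMessageRole, and their string values are assumptions made for this sketch.

```typescript
// Local stand-ins for ChatMessageType and ChatMessageRole; the actual
// values used by Langfuse are not shown in the source.
const MessageTypeSketch = { User: "user" } as const;
const MessageRoleSketch = { User: "user" } as const;

type UserMessageSketch = {
  type: typeof MessageTypeSketch.User;
  role: typeof MessageRoleSketch.User;
  content: string;
};

// Sketch: wrap the compiled prompt in a single user message.
function buildMessagesSketch(prompt: string): [UserMessageSketch] {
  return [
    {
      type: MessageTypeSketch.User,
      role: MessageRoleSketch.User,
      content: prompt,
    },
  ];
}

const sketchMessages = buildMessagesSketch("Rate the relevance of the answer.");
```

The tuple return type (exactly one element) documents in the type system that the judge prompt is always sent as a single user turn.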

buildExecutionMetadata constructs execution metadata by filtering out null entries from job execution identifiers, producing a clean Record<string, string> for inclusion in score events.
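The null-filtering step can be sketched like this. buildMetadataSketch is a hypothetical stand-in; the keys job_execution_id, job_configuration_id, and target_trace_id appear in the example output on this page, while target_observation_id and target_dataset_item_id are assumed by analogy.

```typescript
// Sketch: collect candidate entries, then keep only those with a value.
function buildMetadataSketch(params: {
  jobExecutionId: string;
  jobConfigurationId: string;
  targetTraceId?: string | null;
  targetObservationId?: string | null;
  targetDatasetItemId?: string | null;
}): Record<string, string> {
  const entries: Array<[string, string | null | undefined]> = [
    ["job_execution_id", params.jobExecutionId],
    ["job_configuration_id", params.jobConfigurationId],
    ["target_trace_id", params.targetTraceId],
    ["target_observation_id", params.targetObservationId], // key name assumed
    ["target_dataset_item_id", params.targetDatasetItemId], // key name assumed
  ];

  const result: Record<string, string> = {};
  for (const [key, value] of entries) {
    // `!= null` drops both null and undefined in one check.
    if (value != null) {
      result[key] = value;
    }
  }
  return result;
}

const sketchMetadata = buildMetadataSketch({
  jobExecutionId: "exec-456",
  jobConfigurationId: "config-123",
  targetTraceId: "trace-789",
  targetObservationId: null,
  targetDatasetItemId: null,
});
```

Dropping empty entries up front means the result can be typed as Record<string, string> without nullable values leaking into the score event.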

getEnvironmentFromVariables extracts the environment string from the first extracted variable that has one set, providing the deployment context for the score.
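A first-match lookup of this kind might be written as below. The ExtractedVariableSketch shape is hypothetical: the source does not show the real ExtractedVariable type, so only the optional environment field should be read as meaningful.

```typescript
// Hypothetical shape; field names other than `environment` are placeholders.
interface ExtractedVariableSketch {
  name: string;
  value: string;
  environment?: string;
}

// Sketch: return the environment of the first variable that has one set.
function getEnvironmentSketch(
  variables: ExtractedVariableSketch[],
): string | undefined {
  return variables.find((v) => v.environment)?.environment;
}

const firstEnvironment = getEnvironmentSketch([
  { name: "input", value: "What is Langfuse?" },
  { name: "output", value: "An LLM engineering platform.", environment: "production" },
  { name: "context", value: "docs", environment: "staging" },
]);
const noEnvironment = getEnvironmentSketch([]);
```

Returning undefined when no variable carries an environment leaves the choice of a fallback to the caller.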

Usage

These utilities are called by the executeLLMAsJudgeEvaluation function in the evaluation service. They are pure functions, with one exception: buildScoreEvent reads the current time to stamp the event. This makes them easy to unit test.

Code Reference

Source Location

  • Repository: langfuse
  • File: worker/src/features/evaluation/evalExecutionUtils.ts
  • Lines: 120-139 (buildScoreEvent), 168-180 (validateLLMResponse), 51-56 (buildEvalScoreSchema), 88-96 (buildEvalMessages), 64-80 (buildExecutionMetadata), 148-152 (getEnvironmentFromVariables)

Signature

// Build score event
export function buildScoreEvent(params: BuildScoreEventParams): ScoreEventType;

// Validate LLM response
export function validateLLMResponse(
  params: ValidateLLMResponseParams,
):
  | { success: true; data: { score: number; reasoning: string } }
  | { success: false; error: string };

// Build Zod v3 score schema
export function buildEvalScoreSchema(
  outputSchema: EvalTemplateOutputSchema,
): z.ZodObject<{ reasoning: z.ZodString; score: z.ZodNumber }>;

// Build eval messages
export function buildEvalMessages(
  prompt: string,
): [{ type: ChatMessageType.User; role: ChatMessageRole.User; content: string }];

// Build execution metadata
export function buildExecutionMetadata(params: {
  jobExecutionId: string;
  jobConfigurationId: string;
  targetTraceId?: string | null;
  targetObservationId?: string | null;
  targetDatasetItemId?: string | null;
}): Record<string, string>;

// Get environment from variables
export function getEnvironmentFromVariables(
  variables: ExtractedVariable[],
): string | undefined;

Import

import {
  buildScoreEvent,
  validateLLMResponse,
  buildEvalScoreSchema,
  buildEvalMessages,
  buildExecutionMetadata,
  getEnvironmentFromVariables,
  evalTemplateOutputSchema,
} from "./evalExecutionUtils";

I/O Contract

Inputs

BuildScoreEventParams:

Name | Type | Required | Description
eventId | string | Yes | Unique event identifier (UUID).
scoreId | string | Yes | Unique score identifier (UUID).
traceId | string or null | Yes | The trace ID this score is attached to.
observationId | string or null | Yes | The observation ID this score is attached to (null for trace-level scores).
scoreName | string | Yes | The name of the score (from job configuration).
value | number | Yes | The numeric score value from the LLM judge.
reasoning | string | Yes | The textual reasoning from the LLM judge.
environment | string | Yes | The deployment environment (e.g., "production").
executionTraceId | string | Yes | The W3C trace ID of the evaluation execution's internal trace.
metadata | Record<string, string> | Yes | Execution metadata including job_execution_id, job_configuration_id, etc.

ValidateLLMResponseParams:

Name | Type | Required | Description
response | unknown | Yes | The raw LLM response object from fetchLLMCompletion.
schema | ReturnType<typeof buildEvalScoreSchema> | Yes | The Zod v3 schema to validate against.

Outputs

buildScoreEvent:

Name | Type | Description
ScoreEventType | ScoreEventType | A score creation event with type "score-create", source "EVAL", dataType "NUMERIC", current timestamp, and all provided fields mapped to the event body.

validateLLMResponse:

Name | Type | Description
success result | { success: true; data: { score: number; reasoning: string } } | When validation passes: the parsed score and reasoning.
failure result | { success: false; error: string } | When validation fails: the Zod error message describing what went wrong.

Usage Examples

Validating and Building a Score Event

import {
  buildEvalScoreSchema,
  validateLLMResponse,
  buildScoreEvent,
  buildExecutionMetadata,
  evalTemplateOutputSchema,
} from "./evalExecutionUtils";

// 1. Parse the template's output schema
const parsedSchema = evalTemplateOutputSchema.parse({
  score: "Relevance score from 0 to 10",
  reasoning: "Explanation of the relevance score",
});

// 2. Build the Zod v3 validation schema
const evalScoreSchema = buildEvalScoreSchema(parsedSchema);

// 3. Validate the LLM response
const llmOutput = { score: 8, reasoning: "The response is highly relevant..." };
const validated = validateLLMResponse({
  response: llmOutput,
  schema: evalScoreSchema,
});

if (!validated.success) {
  throw new Error(`Invalid LLM response: ${validated.error}`);
}

// 4. Build execution metadata
const metadata = buildExecutionMetadata({
  jobExecutionId: "exec-456",
  jobConfigurationId: "config-123",
  targetTraceId: "trace-789",
  targetObservationId: null,
  targetDatasetItemId: null,
});

// 5. Build the score event
const scoreEvent = buildScoreEvent({
  eventId: "evt-001",
  scoreId: "score-002",
  traceId: "trace-789",
  observationId: null,
  scoreName: "relevance",
  value: validated.data.score,
  reasoning: validated.data.reasoning,
  environment: "production",
  executionTraceId: "eval-trace-456",
  metadata,
});

// scoreEvent:
// {
//   id: "evt-001",
//   timestamp: "2024-01-15T10:30:00.000Z", // current time at the call; value shown is illustrative
//   type: "score-create",
//   body: {
//     id: "score-002",
//     traceId: "trace-789",
//     observationId: null,
//     name: "relevance",
//     value: 8,
//     comment: "The response is highly relevant...",
//     source: "EVAL",
//     environment: "production",
//     executionTraceId: "eval-trace-456",
//     metadata: { job_execution_id: "exec-456", job_configuration_id: "config-123", target_trace_id: "trace-789" },
//     dataType: "NUMERIC",
//   }
// }

Handling Validation Failure

// Reuses evalScoreSchema from the previous example.
const badOutput = { score: "not a number", reasoning: 42 };
const result = validateLLMResponse({
  response: badOutput,
  schema: evalScoreSchema,
});

if (!result.success) {
  // result.error contains Zod validation error message
  console.error("Validation failed:", result.error);
  // Typically thrown as UnrecoverableError to prevent retry
}
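The closing comment in the example above mentions throwing an unrecoverable error so the job is not retried. The sketch below illustrates that idea in isolation. Both UnrecoverableErrorSketch and runWithRetries are hypothetical stand-ins written for this page; the real error class and the queue's retry logic are not shown in the source.

```typescript
// Hypothetical stand-in for the UnrecoverableError mentioned above.
class UnrecoverableErrorSketch extends Error {
  constructor(message: string) {
    super(message);
    this.name = "UnrecoverableErrorSketch";
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical retry loop: ordinary errors are retried up to maxAttempts,
// unrecoverable ones are rethrown immediately. Returns the attempt count.
function runWithRetries(job: () => void, maxAttempts: number): number {
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts += 1;
    try {
      job();
      return attempts;
    } catch (err) {
      if (err instanceof UnrecoverableErrorSketch) throw err;
      if (attempts === maxAttempts) throw err;
    }
  }
  return attempts;
}

let attemptsMade = 0;
let caughtName = "";
try {
  runWithRetries(() => {
    attemptsMade += 1;
    // A malformed judge response will not improve on retry, so fail fast.
    throw new UnrecoverableErrorSketch("Invalid LLM response: score must be a number");
  }, 3);
} catch (err) {
  caughtName = err instanceof Error ? err.name : "unknown";
}
```

The design point is that a schema violation is deterministic for a given response, so retrying the same job would only repeat the failure.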

Related Pages

Implements Principle
