Implementation: Langfuse FetchLLMCompletion Eval
| Knowledge Sources | |
|---|---|
| Domains | LLM Integration, LLM Evaluation |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
A concrete tool within Langfuse for invoking LLM providers with structured output schemas to obtain evaluation judgments. It is a wrapper around LangChain provider adapters.
Description
The fetchLLMCompletion function is a unified LLM provider interface that wraps LangChain adapters for six major providers: OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex AI, and Google AI Studio. For evaluation use cases, it is called with streaming: false and a structuredOutputSchema (a Zod v3 schema) to obtain a parsed { score: number, reasoning: string } response.
The function performs several responsibilities:
- Credential decryption -- API keys are stored encrypted in the database and decrypted at the point of use. Extra headers (for API proxies) are also decrypted and parsed.
- Internal tracing -- When traceSinkParams are provided (always the case for evaluations), the function sets up an internal Langfuse tracing handler that records the LLM call as a trace within the user's project. A safety invariant enforces that internal traces must use the "langfuse-" environment prefix.
- Message transformation -- Converts Langfuse ChatMessage objects to LangChain BaseMessage types, handling provider-specific requirements (e.g., Anthropic requires user messages).
- Provider instantiation -- Creates the appropriate LangChain chat model client based on the adapter/provider specified in the model params.
- Structured output -- Uses LangChain's .withStructuredOutput() to invoke the model with a Zod schema, ensuring the response is parsed into the expected shape.
- Error wrapping -- LLM API errors are caught and wrapped in LLMCompletionError with retryable/non-retryable classification based on HTTP status codes.
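The retryable/non-retryable classification can be sketched roughly as follows. This is a simplified illustration: the exact set of status codes treated as retryable and the shape of LLMCompletionError are assumptions, not copied from the source.

```typescript
// Hypothetical sketch of error wrapping with retryable classification.
// The real LLMCompletionError lives in @langfuse/shared; the status set
// below is an assumption for illustration only.
class LLMCompletionError extends Error {
  constructor(
    message: string,
    public readonly retryable: boolean,
  ) {
    super(message);
    this.name = "LLMCompletionError";
  }
}

// Transient server and rate-limit errors are worth retrying; other 4xx
// client errors (bad request, invalid credentials) are not.
const RETRYABLE_STATUS = new Set([408, 429, 500, 502, 503, 504]);

function wrapLLMError(message: string, status?: number): LLMCompletionError {
  // No status usually means a network-level failure, which is retryable.
  const retryable = status === undefined || RETRYABLE_STATUS.has(status);
  return new LLMCompletionError(message, retryable);
}
```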
Note: The function uses Zod v3 schemas (not v4) for structured output due to a compatibility issue with ChatVertexAI.
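The message transformation step can be illustrated with a simplified sketch. The role names, the tuple representation, and the placeholder handling are assumptions for illustration; the real converter also handles the Developer role and structured content.

```typescript
// Hypothetical, simplified version of the ChatMessage -> LangChain mapping.
type SimpleChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

// LangChain chat models accept ["system" | "human" | "ai", content] tuples
// as a lightweight alternative to BaseMessage instances.
function toLangChainTuple(msg: SimpleChatMessage): [string, string] {
  const roleMap = { system: "system", user: "human", assistant: "ai" } as const;
  return [roleMap[msg.role], msg.content];
}

// Anthropic rejects conversations that contain no user message; add a
// minimal placeholder in that case (simplified provider-specific handling).
function ensureUserMessage(msgs: SimpleChatMessage[]): SimpleChatMessage[] {
  return msgs.some((m) => m.role === "user")
    ? msgs
    : [...msgs, { role: "user", content: "." }];
}
```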
Usage
This function is called by the evaluation execution pipeline (via the EvalExecutionDeps dependency injection interface) to make the LLM judge call. It is also used by other Langfuse features such as the playground, prompt experiments, and annotation queues.
Code Reference
Source Location
- Repository: langfuse
- File: packages/shared/src/server/llm/fetchLLMCompletion.ts
- Lines: 94-507
Signature
```typescript
// Overload for structured output (used by evaluations)
export async function fetchLLMCompletion(
  params: LLMCompletionParams & {
    streaming: false;
    structuredOutputSchema: ZodSchema;
  },
): Promise<Record<string, unknown>>;

// Overload for plain text output
export async function fetchLLMCompletion(
  params: LLMCompletionParams & {
    streaming: false;
  },
): Promise<string>;

// Overload for streaming output
export async function fetchLLMCompletion(
  params: LLMCompletionParams & {
    streaming: true;
  },
): Promise<IterableReadableStream<Uint8Array>>;

// Overload for tool call output
export async function fetchLLMCompletion(
  params: LLMCompletionParams & {
    streaming: false;
    tools: LLMToolDefinition[];
  },
): Promise<ToolCallResponse>;
```
Import
```typescript
import { fetchLLMCompletion } from "@langfuse/shared/src/server/llm/fetchLLMCompletion";
```
I/O Contract
Inputs
LLMCompletionParams:
| Name | Type | Required | Description |
|---|---|---|---|
| messages | ChatMessage[] | Yes | Array of chat messages with role (User, System, Assistant, Developer) and content (string or structured). |
| modelParams | ModelParams | Yes | Model configuration including adapter (provider), model name, temperature, maxTokens, topP, frequencyPenalty, presencePenalty. |
| llmConnection.secretKey | string | Yes | Encrypted API key for the LLM provider. |
| llmConnection.extraHeaders | string or null | No | Encrypted JSON string of additional HTTP headers for API proxies. |
| llmConnection.baseURL | string or null | No | Custom base URL for the LLM provider endpoint. |
| llmConnection.config | Record<string, string> or null | No | Provider-specific configuration (e.g., Bedrock region, VertexAI project/location). |
| structuredOutputSchema | ZodSchema (v3) | No | Zod v3 schema for structured output. When provided with streaming: false, the response is parsed into the schema shape. |
| streaming | boolean | Yes | Whether to stream the response. Evaluations always use false. |
| callbacks | BaseCallbackHandler[] | No | Additional LangChain callback handlers for custom processing. |
| maxRetries | number | No | Maximum number of retry attempts for the LLM API call. |
| traceSinkParams | TraceSinkParams | No | Configuration for internal Langfuse tracing of the LLM call. |
| shouldUseLangfuseAPIKey | boolean | No | Whether to use the Langfuse platform API key instead of a user-provided key. Defaults to false. |
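As an example of the credential handling described above, the extraHeaders field travels as an encrypted JSON string and must be decrypted and parsed into a header record before being passed to the provider client. A minimal sketch, with the decryption step elided:

```typescript
// Hypothetical sketch: parse decrypted extraHeaders into a header record.
// Decryption itself is elided; this takes the already-decrypted JSON string
// and is not the actual helper used in fetchLLMCompletion.ts.
function parseExtraHeaders(
  decrypted: string | null,
): Record<string, string> | undefined {
  if (!decrypted) return undefined;
  return JSON.parse(decrypted) as Record<string, string>;
}
```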
TraceSinkParams (for evaluations):
| Name | Type | Required | Description |
|---|---|---|---|
| targetProjectId | string | Yes | The project ID to create the internal trace in. |
| traceId | string | Yes | A deterministic trace ID derived from the job execution ID. |
| traceName | string | Yes | Human-readable trace name (e.g., "Execute evaluator: Relevance Check"). |
| environment | string | Yes | Must start with "langfuse-" (e.g., "langfuse-llm-judge"). Enforced as a safety invariant. |
| metadata | Record<string, string> | No | Execution metadata including job_execution_id, job_configuration_id, target_trace_id, score_id. |
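The environment invariant can be expressed as a simple guard. This is a sketch of the documented rule only; the actual check inside fetchLLMCompletion may differ in wording and error type.

```typescript
// Hypothetical guard for the documented safety invariant: internal traces
// generated on behalf of the platform must use a "langfuse-" prefixed
// environment so they are never mixed with user traffic.
function assertInternalEnvironment(environment: string): string {
  if (!environment.startsWith("langfuse-")) {
    throw new Error(
      `Internal trace environment must start with "langfuse-", got "${environment}"`,
    );
  }
  return environment;
}
```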
Outputs
| Name | Type | Description |
|---|---|---|
| Record<string, unknown> | Promise<Record<string, unknown>> | When called with structuredOutputSchema: a parsed object matching the schema shape. For evaluations, this is { score: number, reasoning: string }. |
| string | Promise<string> | When called without structuredOutputSchema and streaming: false: the raw text response. |
| IterableReadableStream<Uint8Array> | Promise<IterableReadableStream<Uint8Array>> | When called with streaming: true: a byte stream of the response. |
| ToolCallResponse | Promise<ToolCallResponse> | When called with tools and streaming: false: a response containing the model's tool calls (per the fourth overload above). |
Usage Examples
Evaluation LLM Call with Structured Output
```typescript
import { fetchLLMCompletion } from "@langfuse/shared/src/server/llm/fetchLLMCompletion";
import { z as zodV3 } from "zod/v3";
import { ChatMessageRole, ChatMessageType, LLMAdapter } from "@langfuse/shared";

const evalScoreSchema = zodV3.object({
  reasoning: zodV3.string().describe("Explanation of the relevance score"),
  score: zodV3.number().describe("Relevance score from 0 to 10"),
});

const result = await fetchLLMCompletion({
  messages: [
    {
      type: ChatMessageType.User,
      role: ChatMessageRole.User,
      content: "Rate the relevance of this response...",
    },
  ],
  modelParams: {
    adapter: LLMAdapter.OpenAI,
    model: "gpt-4o",
    temperature: 0,
    maxTokens: 512,
  },
  llmConnection: {
    secretKey: encryptedApiKey,
    extraHeaders: null,
    baseURL: null,
    config: null,
  },
  streaming: false,
  structuredOutputSchema: evalScoreSchema,
  traceSinkParams: {
    targetProjectId: "proj-123",
    traceId: "eval-trace-789",
    traceName: "Execute evaluator: Relevance Check",
    environment: "langfuse-llm-judge",
    metadata: {
      job_execution_id: "exec-456",
      job_configuration_id: "config-123",
      target_trace_id: "trace-original",
      score_id: "score-789",
    },
  },
});

// result: { score: 8, reasoning: "The response directly addresses..." }
```