Implementation: Langfuse FetchLLMCompletion Eval
| Knowledge Sources | |
|---|---|
| Domains | LLM Integration, LLM Evaluation |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
A concrete tool within Langfuse for invoking LLM providers with structured output schemas to obtain evaluation judgments. It is a wrapper around LangChain provider adapters.
Description
The fetchLLMCompletion function is a unified LLM provider interface that wraps LangChain adapters for six major providers: OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex AI, and Google AI Studio. For evaluation use cases, it is called with streaming: false and a structuredOutputSchema (a Zod v3 schema) to obtain a parsed { score: number, reasoning: string } response.
The function performs several responsibilities:
- Credential decryption -- API keys are stored encrypted in the database and decrypted at the point of use. Extra headers (for API proxies) are also decrypted and parsed.
- Internal tracing -- When traceSinkParams are provided (always the case for evaluations), the function sets up an internal Langfuse tracing handler that records the LLM call as a trace within the user's project. A safety invariant enforces that internal traces must use the "langfuse-" environment prefix.
- Message transformation -- Converts Langfuse ChatMessage objects to LangChain BaseMessage types, handling provider-specific requirements (e.g., Anthropic requires user messages).
- Provider instantiation -- Creates the appropriate LangChain chat model client based on the adapter/provider specified in the model params.
- Structured output -- Uses LangChain's .withStructuredOutput() to invoke the model with a Zod schema, ensuring the response is parsed into the expected shape.
- Error wrapping -- LLM API errors are caught and wrapped in LLMCompletionError with retryable/non-retryable classification based on HTTP status codes.
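The retryable/non-retryable classification can be sketched roughly as follows. This is a simplified illustration: the exact set of status codes treated as retryable and the shape of LLMCompletionError are assumptions, not copied from the source.

```typescript
// Hypothetical sketch of error wrapping with retryable classification.
// The real LLMCompletionError lives in @langfuse/shared; the status set
// below is an assumption for illustration only.
class LLMCompletionError extends Error {
  constructor(
    message: string,
    public readonly retryable: boolean,
  ) {
    super(message);
    this.name = "LLMCompletionError";
  }
}

// Transient server and rate-limit errors are worth retrying; other 4xx
// client errors (bad request, invalid credentials) are not.
const RETRYABLE_STATUS = new Set([408, 429, 500, 502, 503, 504]);

function wrapLLMError(message: string, status?: number): LLMCompletionError {
  // No status usually means a network-level failure, which is retryable.
  const retryable = status === undefined || RETRYABLE_STATUS.has(status);
  return new LLMCompletionError(message, retryable);
}
```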
Note: The function uses Zod v3 schemas (not v4) for structured output due to a compatibility issue with ChatVertexAI.
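The message transformation step can be illustrated with a simplified sketch. The role names, the tuple representation, and the placeholder handling are assumptions for illustration; the real converter also handles the Developer role and structured content.

```typescript
// Hypothetical, simplified version of the ChatMessage -> LangChain mapping.
type SimpleChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

// LangChain chat models accept ["system" | "human" | "ai", content] tuples
// as a lightweight alternative to BaseMessage instances.
function toLangChainTuple(msg: SimpleChatMessage): [string, string] {
  const roleMap = { system: "system", user: "human", assistant: "ai" } as const;
  return [roleMap[msg.role], msg.content];
}

// Anthropic rejects conversations that contain no user message; add a
// minimal placeholder in that case (simplified provider-specific handling).
function ensureUserMessage(msgs: SimpleChatMessage[]): SimpleChatMessage[] {
  return msgs.some((m) => m.role === "user")
    ? msgs
    : [...msgs, { role: "user", content: "." }];
}
```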
Usage
This function is called by the evaluation execution pipeline (via the EvalExecutionDeps dependency injection interface) to make the LLM judge call. It is also used by other Langfuse features such as the playground, prompt experiments, and annotation queues.
Code Reference
Source Location
- Repository: langfuse
- File: packages/shared/src/server/llm/fetchLLMCompletion.ts
- Lines: 94-507
Signature
```typescript
// Overload for structured output (used by evaluations)
export async function fetchLLMCompletion(
  params: LLMCompletionParams & {
    streaming: false;
    structuredOutputSchema: ZodSchema;
  },
): Promise<Record<string, unknown>>;

// Overload for plain text output
export async function fetchLLMCompletion(
  params: LLMCompletionParams & {
    streaming: false;
  },
): Promise<string>;

// Overload for streaming output
export async function fetchLLMCompletion(
  params: LLMCompletionParams & {
    streaming: true;
  },
): Promise<IterableReadableStream<Uint8Array>>;

// Overload for tool call output
export async function fetchLLMCompletion(
  params: LLMCompletionParams & {
    streaming: false;
    tools: LLMToolDefinition[];
  },
): Promise<ToolCallResponse>;
```
Import
```typescript
import { fetchLLMCompletion } from "@langfuse/shared/src/server/llm/fetchLLMCompletion";
```
I/O Contract
Inputs
LLMCompletionParams:
| Name | Type | Required | Description |
|---|---|---|---|
| messages | ChatMessage[] | Yes | Array of chat messages with role (User, System, Assistant, Developer) and content (string or structured). |
| modelParams | ModelParams | Yes | Model configuration including adapter (provider), model name, temperature, maxTokens, topP, frequencyPenalty, presencePenalty. |
| llmConnection.secretKey | string | Yes | Encrypted API key for the LLM provider. |
| llmConnection.extraHeaders | string or null | No | Encrypted JSON string of additional HTTP headers for API proxies. |
| llmConnection.baseURL | string or null | No | Custom base URL for the LLM provider endpoint. |
| llmConnection.config | Record<string, string> or null | No | Provider-specific configuration (e.g., Bedrock region, VertexAI project/location). |
| structuredOutputSchema | ZodSchema (v3) | No | Zod v3 schema for structured output. When provided with streaming: false, the response is parsed into the schema shape. |
| streaming | boolean | Yes | Whether to stream the response. Evaluations always use false. |
| callbacks | BaseCallbackHandler[] | No | Additional LangChain callback handlers for custom processing. |
| maxRetries | number | No | Maximum number of retry attempts for the LLM API call. |
| traceSinkParams | TraceSinkParams | No | Configuration for internal Langfuse tracing of the LLM call. |
| shouldUseLangfuseAPIKey | boolean | No | Whether to use the Langfuse platform API key instead of a user-provided key. Defaults to false. |
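As an example of the credential handling described above, the extraHeaders field travels as an encrypted JSON string and must be decrypted and parsed into a header record before being passed to the provider client. A minimal sketch, with the decryption step elided:

```typescript
// Hypothetical sketch: parse decrypted extraHeaders into a header record.
// Decryption itself is elided; this takes the already-decrypted JSON string
// and is not the actual helper used in fetchLLMCompletion.ts.
function parseExtraHeaders(
  decrypted: string | null,
): Record<string, string> | undefined {
  if (!decrypted) return undefined;
  return JSON.parse(decrypted) as Record<string, string>;
}
```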
TraceSinkParams (for evaluations):
| Name | Type | Required | Description |
|---|---|---|---|
| targetProjectId | string | Yes | The project ID to create the internal trace in. |
| traceId | string | Yes | A deterministic trace ID derived from the job execution ID. |
| traceName | string | Yes | Human-readable trace name (e.g., "Execute evaluator: Relevance Check"). |
| environment | string | Yes | Must start with "langfuse-" (e.g., "langfuse-llm-judge"). Enforced as a safety invariant. |
| metadata | Record<string, string> | No | Execution metadata including job_execution_id, job_configuration_id, target_trace_id, score_id. |
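The environment invariant can be expressed as a simple guard. This is a sketch of the documented rule only; the actual check inside fetchLLMCompletion may differ in wording and error type.

```typescript
// Hypothetical guard for the documented safety invariant: internal traces
// generated on behalf of the platform must use a "langfuse-" prefixed
// environment so they are never mixed with user traffic.
function assertInternalEnvironment(environment: string): string {
  if (!environment.startsWith("langfuse-")) {
    throw new Error(
      `Internal trace environment must start with "langfuse-", got "${environment}"`,
    );
  }
  return environment;
}
```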
Outputs
| Name | Type | Description |
|---|---|---|
| Record<string, unknown> | Promise<Record<string, unknown>> | When called with structuredOutputSchema: a parsed object matching the schema shape. For evaluations, this is { score: number, reasoning: string }. |
| string | Promise<string> | When called without structuredOutputSchema and streaming: false: the raw text response. |
| IterableReadableStream<Uint8Array> | Promise<IterableReadableStream<Uint8Array>> | When called with streaming: true: a byte stream of the response. |
| ToolCallResponse | Promise<ToolCallResponse> | When called with tools and streaming: false: a response containing the model's tool calls (per the fourth overload above). |
Usage Examples
Evaluation LLM Call with Structured Output
```typescript
import { fetchLLMCompletion } from "@langfuse/shared/src/server/llm/fetchLLMCompletion";
import { z as zodV3 } from "zod/v3";
import { ChatMessageRole, ChatMessageType, LLMAdapter } from "@langfuse/shared";

const evalScoreSchema = zodV3.object({
  reasoning: zodV3.string().describe("Explanation of the relevance score"),
  score: zodV3.number().describe("Relevance score from 0 to 10"),
});

const result = await fetchLLMCompletion({
  messages: [
    {
      type: ChatMessageType.User,
      role: ChatMessageRole.User,
      content: "Rate the relevance of this response...",
    },
  ],
  modelParams: {
    adapter: LLMAdapter.OpenAI,
    model: "gpt-4o",
    temperature: 0,
    maxTokens: 512,
  },
  llmConnection: {
    secretKey: encryptedApiKey,
    extraHeaders: null,
    baseURL: null,
    config: null,
  },
  streaming: false,
  structuredOutputSchema: evalScoreSchema,
  traceSinkParams: {
    targetProjectId: "proj-123",
    traceId: "eval-trace-789",
    traceName: "Execute evaluator: Relevance Check",
    environment: "langfuse-llm-judge",
    metadata: {
      job_execution_id: "exec-456",
      job_configuration_id: "config-123",
      target_trace_id: "trace-original",
      score_id: "score-789",
    },
  },
});

// result: { score: 8, reasoning: "The response directly addresses..." }
```