Implementation: Langfuse EvalRouter CreateTemplate
| Knowledge Sources | Details |
|---|---|
| Domains | LLM Evaluation, Configuration Management |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
A concrete implementation, provided by Langfuse, for creating evaluation templates and evaluation job configurations.
Description
This implementation consists of two tRPC mutation procedures on the eval router:
createTemplate creates a new versioned evaluation template within a project. It validates the LLM model configuration by making a test structured-output call, resolves the template version by finding prior versions with the same name and incrementing, creates the Prisma record inside a transaction, and optionally updates all referencing job configurations to point at the new version. It also supports cloning Langfuse-managed (global) templates into project-level templates.
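The version-resolution step described above can be sketched as follows. Note that `nextVersion` is a hypothetical helper written for illustration, not code from the actual router:

```typescript
// Hypothetical sketch of name-based template versioning (not the actual
// router code): given the versions of prior templates that share the same
// name within the project, the new template's version is the highest prior
// version plus one, or 1 when no prior version exists.
function nextVersion(priorVersions: number[]): number {
  if (priorVersions.length === 0) return 1; // first template with this name
  return Math.max(...priorVersions) + 1;
}

console.log(nextVersion([]));        // 1: first version
console.log(nextVersion([1, 2, 3])); // 4: increments the latest version
```

In the real procedure this lookup and the subsequent insert happen inside a single Prisma transaction, so two concurrent creates with the same name cannot both claim the same version number.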
createJob creates a new job configuration that binds an evaluation template to a target object type with filter, sampling, delay, and time scope settings. It validates the referenced template exists and is accessible to the project, creates the job configuration record with ACTIVE status, clears the "no eval configs" cache so new traces are properly evaluated, and optionally enqueues a batch action to apply evaluations to historical traces when the time scope includes "EXISTING".
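The sampling setting is a probability applied per trace. A minimal sketch of how such a check could work, with an injected random source for testability (this is illustrative, not the actual worker code):

```typescript
// Hypothetical per-trace sampling check (not the actual worker code):
// a trace is selected for evaluation when a uniform draw in [0, 1) falls
// below the configured sampling rate. The rng parameter is injected so the
// decision can be tested deterministically.
function shouldEvaluate(
  sampling: number,
  rng: () => number = Math.random,
): boolean {
  if (sampling <= 0 || sampling > 1) {
    throw new Error("sampling must be in (0, 1]");
  }
  return rng() < sampling;
}

console.log(shouldEvaluate(1, () => 0.99));  // true: sampling = 1 evaluates every trace
console.log(shouldEvaluate(0.1, () => 0.5)); // false: draw 0.5 >= 0.1, trace skipped
```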
Usage
These procedures are called from the Langfuse web UI when a user creates or updates an evaluation template or creates a new evaluator (job configuration). They are invoked via tRPC from the frontend evaluation settings pages.
Code Reference
Source Location
- Repository: langfuse
- File: web/src/features/evals/server/router.ts
- Lines: 794-967 (createTemplate), 709-793 (createJob)
Signature
// createTemplate
evalRouter.createTemplate: protectedProjectProcedure
.input(CreateEvalTemplate)
.mutation(async ({ input, ctx }) => EvalTemplate)
// createJob
evalRouter.createJob: protectedProjectProcedure
.input(CreateEvalJobSchema)
.mutation(async ({ input, ctx }) => void)
Import
// These are tRPC procedures, not directly imported.
// They are accessed via the tRPC client:
import { api } from "@/src/utils/api";
// Usage:
api.evals.createTemplate.mutate({ ... });
api.evals.createJob.mutate({ ... });
I/O Contract
Inputs
CreateEvalTemplate:
| Name | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique template name within the project. Used for versioning. |
| projectId | string | Yes | The project this template belongs to. |
| prompt | string | Yes | The evaluation prompt with {{variable}} placeholders. |
| provider | string or null | No | LLM provider identifier. Null uses project default. |
| model | string or null | No | LLM model identifier. Null uses project default. |
| modelParams | ZodModelConfig or null | No | Model parameters (temperature, maxTokens, topP, etc.). |
| vars | string[] | Yes | List of variable names referenced in the prompt template. |
| outputSchema | { score: string, reasoning: string } | Yes | Descriptions for the score and reasoning fields in structured output. |
| cloneSourceId | string | No | ID of a Langfuse-managed template to clone into a project-level template. |
| referencedEvaluators | "UPDATE" or "PERSIST" | No | Whether to update existing job configs referencing prior versions. Defaults to "PERSIST". |
CreateEvalJobSchema:
| Name | Type | Required | Description |
|---|---|---|---|
| projectId | string | Yes | The project this job belongs to. |
| evalTemplateId | string | Yes | Reference to the evaluation template to use. |
| scoreName | string | Yes | Name of the score to produce (min 1 character). |
| target | string | Yes | Target object type: "trace" or "dataset". |
| filter | singleFilter[] or null | No | Array of filter conditions to match traces/datasets. |
| mapping | variableMapping[] | Yes | Maps template variables to trace/observation/dataset columns. |
| sampling | number | Yes | Probability of evaluation (0 < sampling <= 1). |
| delay | number | No | Delay in milliseconds before evaluation. Defaults to DEFAULT_TRACE_JOB_DELAY (10s). |
| timeScope | TimeScopeSchema | Yes | Array of "NEW" and/or "EXISTING" indicating when to apply. |
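The constraints in the table above (non-empty scoreName, sampling in (0, 1], non-empty timeScope) can be mirrored in a plain validation function. This is a hypothetical reconstruction for illustration; the real `CreateEvalJobSchema` is a Zod schema, not this function:

```typescript
// Hypothetical validation sketch mirroring the constraints documented in
// the inputs table (not the actual Zod schema): scoreName must be at least
// 1 character, sampling must be in (0, 1], and timeScope must be a
// non-empty subset of "NEW" / "EXISTING".
type TimeScope = Array<"NEW" | "EXISTING">;

function validateJobInput(input: {
  scoreName: string;
  sampling: number;
  timeScope: TimeScope;
}): string[] {
  const errors: string[] = [];
  if (input.scoreName.length < 1) {
    errors.push("scoreName must be at least 1 character");
  }
  if (!(input.sampling > 0 && input.sampling <= 1)) {
    errors.push("sampling must be in (0, 1]");
  }
  if (input.timeScope.length === 0) {
    errors.push("timeScope must include NEW and/or EXISTING");
  }
  return errors;
}

console.log(validateJobInput({ scoreName: "relevance", sampling: 0.1, timeScope: ["NEW"] })); // []
console.log(validateJobInput({ scoreName: "", sampling: 2, timeScope: [] }).length);          // 3
```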
Outputs
createTemplate:
| Name | Type | Description |
|---|---|---|
| EvalTemplate | Prisma EvalTemplate record | The newly created template record, with an auto-incremented version and all input fields persisted; an audit log entry is also created. |
createJob:
| Name | Type | Description |
|---|---|---|
| void | void | No return value. Side effects: JobConfiguration record created in DB, no-eval-configs cache cleared. If timeScope includes "EXISTING", a BatchActionQueue job is enqueued. |
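The side effects listed for createJob branch on the job's time scope: the batch action is enqueued only when historical traces are in scope. A hypothetical sketch of that branching (not the actual router code):

```typescript
// Hypothetical sketch of createJob's side-effect branching (not the actual
// router code): the record creation and cache invalidation always happen,
// while the BatchActionQueue job is only enqueued when the time scope
// covers existing (historical) traces.
function plannedSideEffects(timeScope: Array<"NEW" | "EXISTING">): string[] {
  const effects = [
    "create JobConfiguration record with ACTIVE status",
    "clear no-eval-configs cache",
  ];
  if (timeScope.includes("EXISTING")) {
    effects.push("enqueue BatchActionQueue job for historical traces");
  }
  return effects;
}

console.log(plannedSideEffects(["NEW"]).length);             // 2: no batch action
console.log(plannedSideEffects(["NEW", "EXISTING"]).length); // 3: batch action enqueued
```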
Usage Examples
Creating a Relevance Evaluation Template
const template = await api.evals.createTemplate.mutate({
name: "Relevance Check",
projectId: "proj-123",
prompt: "Given the following user query and AI response, rate the relevance of the response on a scale of 0-10.\n\nUser Query: {{input}}\nAI Response: {{output}}",
provider: "openai",
model: "gpt-4o",
modelParams: { temperature: 0, maxTokens: 512 },
vars: ["input", "output"],
outputSchema: {
score: "Relevance score from 0 (completely irrelevant) to 10 (perfectly relevant)",
reasoning: "Explanation of why this score was assigned"
},
});
Creating a Job Configuration with Sampling
await api.evals.createJob.mutate({
projectId: "proj-123",
evalTemplateId: template.id,
scoreName: "relevance",
target: "trace",
filter: [
{ column: "name", operator: "=", value: "chat-completion", type: "string" }
],
mapping: [
{
templateVariable: "input",
langfuseObject: "trace",
selectedColumnId: "input",
jsonSelector: null,
objectName: null,
},
{
templateVariable: "output",
langfuseObject: "trace",
selectedColumnId: "output",
jsonSelector: null,
objectName: null,
},
],
sampling: 0.1, // Evaluate 10% of matching traces
delay: 15000, // Wait 15 seconds for trace data to settle
timeScope: ["NEW", "EXISTING"], // Apply to both new and historical traces
});