Principle:Langfuse Langfuse Experiment Run Creation
| Knowledge Sources | |
|---|---|
| Domains | LLM Experimentation, Job Orchestration |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Experiment run creation is the process of persisting an experiment's full configuration as a dataset run record and dispatching it to an asynchronous processing queue, decoupling the user-facing request from the potentially long-running LLM execution workload.
Description
In an LLM experimentation platform, a single experiment may involve hundreds or thousands of dataset items, each requiring an LLM call, trace creation, and evaluation scheduling. Executing all of this synchronously within an HTTP request would be impractical: the request would time out, the user would have no feedback, and a transient failure would lose all progress.
Experiment run creation solves this by splitting the workflow into two parts:
- Record creation: A persistent database record (the "dataset run") is created that captures the complete experiment specification -- the prompt to use, the model provider and parameters, optional structured output schemas, and the dataset version. This record serves as the single source of truth for the experiment and can be queried by the frontend for status updates.
- Queue dispatch: A job is enqueued onto a durable message queue with retry semantics. The job payload contains only the identifiers needed to reconstruct the experiment context (project ID, dataset ID, run ID). The actual LLM execution is handled by a separate worker process that pulls jobs from the queue.
This pattern provides several important properties:
- Durability: The experiment configuration survives web server restarts because it is persisted in PostgreSQL.
- Retry resilience: The queue system provides configurable retry attempts with exponential backoff, so transient failures (API rate limits, network blips) do not permanently lose the experiment.
- Scalability: Multiple worker instances can consume from the queue concurrently, allowing horizontal scaling of experiment throughput.
- Immediate feedback: The user receives a success response with the run ID almost immediately, and can navigate to the run detail page to watch results appear.
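To make the retry property concrete, the following is a minimal sketch of how an exponential backoff schedule grows. It assumes the delay doubles after every failed attempt, starting from a base delay; the helper name `backoffDelaysMs` is hypothetical, and the exact formula depends on the queue library in use.

```typescript
// Hypothetical helper: delay before each retry, assuming the delay doubles
// after every failed attempt (a common, but not universal, backoff formula).
function backoffDelaysMs(maxAttempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  // The first attempt runs immediately, so there are maxAttempts - 1 retries.
  for (let retry = 0; retry < maxAttempts - 1; retry++) {
    delays.push(baseDelayMs * Math.pow(2, retry));
  }
  return delays;
}

// With 4 attempts and a 10-second base delay, the retries wait
// 10 s, 20 s, and 40 s, for 70 s of waiting in total.
const retryDelays = backoffDelaysMs(4, 10000);
const totalWaitMs = retryDelays.reduce((sum, d) => sum + d, 0);
```

Because the waits grow geometrically, a short outage is retried quickly while a longer one is not hammered with requests.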
Usage
Use experiment run creation when:
- A user has completed the experiment configuration form and confirmed they want to execute the experiment.
- A programmatic client needs to trigger an experiment and track it by run ID.
- The system needs to guarantee that experiment metadata is preserved even if the worker is temporarily unavailable.
Theoretical Basis
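The experiment run creation pattern borrows from Command Query Responsibility Segregation (CQRS), in that the command that creates the run is handled separately from the queries the frontend uses to read its status, and combines this with an asynchronous work queue pattern.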
Step 1 -- Metadata Assembly
The experiment metadata object is constructed from the user's input. It captures the prompt ID, provider, model name, model parameters (temperature, max tokens, top-p, etc.), and optional fields like structured output schema and dataset version. This metadata is stored as a JSON column in the dataset run record so that the worker can later reconstruct the full configuration without any additional user input.
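The assembly step can be sketched as a small pure function. The snake_case keys follow the fields named in this section; the input interface, the function name `buildExperimentMetadata`, and the concrete types (for example, a string dataset version) are assumptions made for illustration.

```typescript
// Hypothetical input shape; field types are assumed, not taken from the source.
interface ExperimentInput {
  promptId: string;
  provider: string;
  model: string;
  modelParams: Record<string, number>;
  structuredOutputSchema?: object;
  datasetVersion?: string;
  experimentName: string;
  runName: string;
}

type ExperimentMetadata = Record<string, unknown>;

function buildExperimentMetadata(input: ExperimentInput): ExperimentMetadata {
  const metadata: ExperimentMetadata = {
    prompt_id: input.promptId,
    provider: input.provider,
    model: input.model,
    model_params: input.modelParams,
    experiment_name: input.experimentName,
    experiment_run_name: input.runName,
  };
  // Optional fields are written only when present, so the stored JSON
  // never contains keys with undefined values.
  if (input.structuredOutputSchema !== undefined) {
    metadata["structured_output_schema"] = input.structuredOutputSchema;
  }
  if (input.datasetVersion !== undefined) {
    metadata["dataset_version"] = input.datasetVersion;
  }
  return metadata;
}

const exampleMetadata = buildExperimentMetadata({
  promptId: "prompt-1",
  provider: "example-provider",
  model: "example-model",
  modelParams: { temperature: 0.2, max_tokens: 256 },
  experimentName: "Prompt comparison",
  runName: "baseline",
});

// Round-trip through JSON to mimic storage in, and retrieval from, a JSON column.
const restoredMetadata = JSON.parse(JSON.stringify(exampleMetadata)) as ExperimentMetadata;
```

Keeping the result a plain JSON-serializable object is what lets the worker rebuild the configuration from the stored record alone.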
Step 2 -- Database Record Creation
A datasetRuns record is created via the ORM with the following key fields:
- name: The user-specified run name (unique within the dataset for identification).
- description: Optional human-readable description.
- datasetId: Foreign key linking to the dataset.
- projectId: Scoping to the correct project.
- metadata: The JSON blob containing the full experiment configuration, including convenience fields like experiment_name and experiment_run_name.
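As an illustration of the record shape and the per-dataset name uniqueness described above, here is a sketch that uses an in-memory class in place of the ORM. The class `InMemoryDatasetRuns` and the ID format are hypothetical; a real deployment would rely on a database constraint rather than a linear scan.

```typescript
interface DatasetRun {
  id: string;
  name: string;
  description?: string;
  datasetId: string;
  projectId: string;
  metadata: Record<string, unknown>;
}

// In-memory stand-in for the datasetRuns table (assumption for illustration).
class InMemoryDatasetRuns {
  private rows: DatasetRun[] = [];
  private nextId = 1;

  create(data: Omit<DatasetRun, "id">): DatasetRun {
    // Run names must be unique within a dataset of a given project.
    const duplicate = this.rows.some(
      (r) =>
        r.projectId === data.projectId &&
        r.datasetId === data.datasetId &&
        r.name === data.name,
    );
    if (duplicate) {
      throw new Error(`Run name "${data.name}" already exists in this dataset`);
    }
    const row: DatasetRun = { id: `run-${this.nextId++}`, ...data };
    this.rows.push(row);
    return row;
  }
}

const datasetRuns = new InMemoryDatasetRuns();
const baseRun = {
  name: "baseline",
  datasetId: "ds-1",
  projectId: "proj-1",
  metadata: { experiment_run_name: "baseline" },
};
const firstRun = datasetRuns.create(baseRun);

// Reusing the name in the same dataset is rejected...
let duplicateRejected = false;
try {
  datasetRuns.create(baseRun);
} catch {
  duplicateRejected = true;
}

// ...while the same name in a different dataset is accepted.
const otherDatasetRun = datasetRuns.create({ ...baseRun, datasetId: "ds-2" });
```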
Step 3 -- Queue Dispatch
A job is added to the experiment creation queue with the following characteristics:
- Queue name: A dedicated named queue for experiment creation jobs.
- Payload: Contains only projectId, datasetId, runId, and an optional description. All other configuration is retrieved from the database record by the worker.
- Retry policy: Up to N attempts with exponential backoff, ensuring resilience against transient failures.
- Idempotency support: The job includes a unique ID and timestamp, and the worker performs deduplication at the item level.
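The item-level deduplication mentioned above is what makes retries safe: a retried job can skip items that an earlier attempt already handled. The sketch below shows one way this could work, assuming the worker records a key per run and dataset item; the job shape, the `processedKeys` store, and `processJob` are hypothetical and do not describe the actual worker.

```typescript
interface ExperimentCreateJob {
  id: string; // unique job ID
  timestamp: string; // ISO-8601 enqueue time
  payload: { projectId: string; datasetId: string; runId: string; description?: string };
}

// Hypothetical record of which (run, item) pairs already have a result.
// A real worker would check persistent storage instead of process memory.
const processedKeys: Record<string, boolean> = {};
let llmCallCount = 0;

function processJob(job: ExperimentCreateJob, datasetItemIds: string[]): number {
  let newlyProcessed = 0;
  for (const itemId of datasetItemIds) {
    const key = `${job.payload.runId}:${itemId}`;
    if (processedKeys[key]) continue; // handled by an earlier attempt
    llmCallCount++; // placeholder for the LLM call and trace creation
    processedKeys[key] = true;
    newlyProcessed++;
  }
  return newlyProcessed;
}

const exampleJob: ExperimentCreateJob = {
  id: "job-1",
  timestamp: "2026-01-01T00:00:00.000Z",
  payload: { projectId: "proj-1", datasetId: "ds-1", runId: "run-1" },
};

// First attempt gets through two items before (hypothetically) failing.
const firstAttempt = processJob(exampleJob, ["item-a", "item-b"]);
// The retry sees all three items but only processes the one still missing.
const retryAttempt = processJob(exampleJob, ["item-a", "item-b", "item-c"]);
```

With this check in place, the number of LLM calls equals the number of distinct items, regardless of how many times the job is retried.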
Taken together, the three steps can be expressed in pseudocode:

    FUNCTION createExperiment(input):
        metadata = {
            prompt_id, provider, model, model_params,
            structured_output_schema?, dataset_version?,
            experiment_name, experiment_run_name
        }

        datasetRun = database.create("datasetRuns", {
            name: input.runName,
            description: input.description,
            datasetId: input.datasetId,
            projectId: input.projectId,
            metadata: metadata,
        })

        queue.add("ExperimentCreate", {
            projectId, datasetId, runId: datasetRun.id
        }, retries=10, backoff=exponential(10s))

        RETURN { success: true, runId: datasetRun.id }
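For readers who prefer something executable, the pseudocode can be approximated in TypeScript as follows. This is a sketch under stated assumptions: `runTable` and `jobQueue` are in-memory arrays standing in for the database and the durable queue, the type names are hypothetical, and the retry values simply mirror the pseudocode.

```typescript
interface CreateExperimentInput {
  projectId: string;
  datasetId: string;
  runName: string;
  description?: string;
  metadata: Record<string, unknown>; // assembled as in Step 1
}

interface RunRecord {
  id: string;
  name: string;
  description?: string;
  datasetId: string;
  projectId: string;
  metadata: Record<string, unknown>;
}

interface QueuedJob {
  queue: string;
  payload: { projectId: string; datasetId: string; runId: string; description?: string };
  attempts: number;
  backoff: { type: "exponential"; delayMs: number };
}

// In-memory stand-ins for the database table and the durable queue.
const runTable: RunRecord[] = [];
const jobQueue: QueuedJob[] = [];

function createExperiment(input: CreateExperimentInput): { success: true; runId: string } {
  // Step 2: persist the run first, so the configuration survives even if
  // no worker is available to pick up the job yet.
  const run: RunRecord = {
    id: `run-${runTable.length + 1}`,
    name: input.runName,
    description: input.description,
    datasetId: input.datasetId,
    projectId: input.projectId,
    metadata: input.metadata,
  };
  runTable.push(run);

  // Step 3: enqueue identifiers only; the worker reloads the rest by runId.
  jobQueue.push({
    queue: "ExperimentCreate",
    payload: {
      projectId: run.projectId,
      datasetId: run.datasetId,
      runId: run.id,
      description: run.description,
    },
    attempts: 10,
    backoff: { type: "exponential", delayMs: 10000 },
  });

  return { success: true, runId: run.id };
}

const creationResult = createExperiment({
  projectId: "proj-1",
  datasetId: "ds-1",
  runName: "baseline",
  metadata: { provider: "example-provider", model: "example-model" },
});
```

The caller gets `creationResult.runId` back as soon as the record and job exist; the stored metadata never travels through the queue.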