Principle: FlowiseAI Flowise Evaluation Rerun
| Property | Value |
|---|---|
| Principle Name | Evaluation_Rerun |
| Overview | Technique for re-executing a previous evaluation run to measure improvement after chatflow modifications |
| Domain | AI Evaluation, Iterative Improvement, Regression Testing |
| Source | FlowiseAI/Flowise repository: packages/ui/src/api/evaluations.js |
| Last Updated | 2026-02-12 14:00 GMT |
Description
After modifying a chatflow (changing prompts, swapping models, adjusting parameters), users re-run the same evaluation to compare results. The re-run creates a new version of the evaluation using the same dataset and evaluators but tests against the updated chatflow. This enables iterative improvement tracking.
The re-run process:
- The user identifies an existing evaluation run that was previously executed.
- The user modifies the target chatflow (e.g., refines system prompt, changes model, adjusts temperature).
- The user triggers a re-run of the evaluation, which:
  - Uses the same dataset (identical input/expected-output pairs).
  - Uses the same evaluators (identical scoring criteria).
  - Tests against the updated chatflow (reflecting recent modifications).
  - Creates a new version of the evaluation results.
- The new version's results can be compared against previous versions to measure improvement or detect regressions.
This cycle can be repeated as many times as needed, building a version history that tracks the chatflow's quality trajectory over time.
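The re-run steps above can be sketched in plain JavaScript. This is a minimal in-memory model, not Flowise's actual API: the `rerunEvaluation` helper, the `evaluationVersions` store, and field names such as `dataset`, `evaluators`, and `rows` are illustrative assumptions.

```javascript
// In-memory stand-in for the evaluation store; Flowise persists runs server-side.
const evaluationVersions = [];

// Hypothetical helper: re-execute an existing evaluation against the current chatflow.
// The dataset and evaluators are carried over unchanged from the previous run,
// so the chatflow is the only variable that differs between versions.
function rerunEvaluation(previousRun, runChatflow) {
  const version = previousRun.version + 1;
  const rows = previousRun.dataset.map(({ input, expectedOutput }) => {
    const actualOutput = runChatflow(input); // the updated chatflow under test
    const scores = previousRun.evaluators.map((evaluate) =>
      evaluate({ input, expectedOutput, actualOutput })
    );
    return { input, expectedOutput, actualOutput, scores };
  });
  const newRun = { ...previousRun, version, rows, executedAt: new Date().toISOString() };
  evaluationVersions.push(newRun); // append-only: previous versions are never overwritten
  return newRun;
}
```

Appending rather than updating in place is what builds the version history the comparison step relies on.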
Usage
Use an evaluation re-run to re-test a chatflow after modifications and measure the resulting quality change. This is a critical step in the iterative development workflow:
- After adjusting chatflow parameters (prompts, models, tools)
- After fixing issues identified in previous evaluation results
- As part of a continuous improvement process for chatflow quality
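In a continuous-improvement setting, the latest version's aggregate score can act as a quality gate. The sketch below is a hypothetical helper (the `0.8` threshold, the `rows`/`scores` shape, and higher-is-better scoring are all assumptions for illustration):

```javascript
// Hypothetical quality gate: pass only if the newest evaluation version
// meets a minimum average evaluator score (higher is assumed better).
function passesQualityGate(versionHistory, minAverageScore = 0.8) {
  const latest = versionHistory[versionHistory.length - 1];
  const scores = latest.rows.flatMap((row) => row.scores);
  const average = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  return average >= minAverageScore;
}
```

Such a gate could run after each re-run to block deployment of a chatflow whose latest evaluation version fell below the bar.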
Theoretical Basis
This principle follows the iterative improvement cycle pattern. Each re-run creates a versioned snapshot of evaluation results, enabling before/after comparison. Reusing the same inputs ensures consistent testing conditions across versions.
Key aspects of the iterative improvement model:
- Controlled variables: By reusing the same dataset and evaluators, the only variable that changes between versions is the chatflow itself. This isolates the impact of chatflow modifications on quality metrics.
- Version immutability: Each evaluation version is an immutable snapshot. Previous results are never overwritten, ensuring a complete audit trail of quality changes.
- Regression detection: Comparing new results against previous versions reveals not only improvements but also regressions, where changes intended to fix one issue inadvertently degrade another dimension.
- Convergence tracking: Over multiple re-runs, the version history reveals whether iterative modifications are converging toward better quality or oscillating without clear improvement.
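The regression-detection aspect above can be sketched as a per-metric comparison between two versions. The function name and the flat `{ metric: averageScore }` shape are assumptions; scores are assumed to be higher-is-better.

```javascript
// Compare per-metric average scores between two evaluation versions.
// Reports which metrics improved, regressed, or stayed flat, so a change
// that fixes one dimension but degrades another is surfaced explicitly.
function compareVersions(oldMetrics, newMetrics, tolerance = 1e-9) {
  const report = { improved: [], regressed: [], unchanged: [] };
  for (const metric of Object.keys(oldMetrics)) {
    const delta = newMetrics[metric] - oldMetrics[metric];
    if (delta > tolerance) report.improved.push(metric);
    else if (delta < -tolerance) report.regressed.push(metric);
    else report.unchanged.push(metric);
  }
  return report;
}
```

Running this across consecutive version pairs gives a simple view of whether the chatflow's quality trajectory is converging or oscillating.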