# Implementation:FlowiseAI Flowise CreateEvaluation
| Property | Value |
|---|---|
| Implementation Name | CreateEvaluation |
| Implements | Principle:FlowiseAI_Flowise_Evaluation_Run_Creation |
| Source | packages/ui/src/api/evaluations.js |
| Repository | FlowiseAI/Flowise |
| Domain | API Client, Evaluation Orchestration |
| Last Updated | 2026-02-12 14:00 GMT |
## Code Reference

### Source Location

The evaluation creation API function is defined in packages/ui/src/api/evaluations.js at line 7.
### Signature

```javascript
// packages/ui/src/api/evaluations.js:L7
const createEvaluation = (body) => client.post(`/evaluations`, body)
```

The API client is configured at packages/ui/src/api/client.js with a base URL of `${baseURL}/api/v1`, making the full endpoint:

```
POST /api/v1/evaluations
```
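The route string passed to `client.post` is relative; the configured client prepends the `/api/v1` prefix when resolving it against the base URL. A minimal sketch of that resolution (the `apiUrl` helper is illustrative, not part of the Flowise source):

```javascript
// Illustrative only: mirrors how a relative route is resolved against
// a client base URL of `${baseURL}/api/v1`.
const apiUrl = (baseURL, route) => `${baseURL}/api/v1${route}`

// apiUrl('http://localhost:3000', '/evaluations')
//   → 'http://localhost:3000/api/v1/evaluations'
```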
### Import

```javascript
import evaluationApi from '@/api/evaluations'
```
## I/O Contract

### createEvaluation

**Inputs:**

- `body` (Object):
  - `name` (string, required): Display name for the evaluation run
  - `evaluationType` (string, required): One of `'llm'` or `'benchmarking'`
  - `datasetId` (string, required): ID of the dataset to evaluate against
  - `datasetName` (string, required): Display name of the dataset
  - `chatflowId` (string, required): JSON-encoded array of chatflow IDs to evaluate
  - `chatflowName` (string, required): JSON-encoded array of chatflow display names
  - `chatflowType` (string, required): JSON-encoded array of chatflow types
  - `selectedSimpleEvaluators` (string[], required): Array of simple evaluator IDs (text, JSON, numeric)
  - `selectedLLMEvaluators` (string[], required): Array of LLM evaluator IDs
  - `model` (string, optional): Model identifier for LLM-based evaluation
  - `llm` (string, optional): LLM provider node name
  - `credentialId` (string, optional): Credential ID for the LLM provider
  - `datasetAsOneConversation` (boolean, required): Whether to send all rows as a single conversation
**Outputs:**

- `Promise<{data: EvaluationRun[]}>`: Resolves with an array of evaluation run objects, each containing:
  - `id` (string): Unique identifier for the evaluation run
  - `name` (string): Display name of the evaluation
  - `version` (number): Version number of the run
  - `status` (string): Current status of the evaluation run
  - `runDate` (string): ISO timestamp of when the run was executed
  - `average_metrics` (Object): Aggregated metrics across all rows
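Because most fields in the contract are required and three of them must be JSON-encoded arrays, a small pre-flight check can catch a malformed body before the request is sent. A hypothetical sketch (this helper is not part of the Flowise codebase; it simply restates the required fields listed above):

```javascript
// Hypothetical pre-flight check, not part of the Flowise source.
// Lists the required fields from the I/O contract and reports any
// that are absent from a candidate request body.
const REQUIRED_FIELDS = [
    'name', 'evaluationType', 'datasetId', 'datasetName',
    'chatflowId', 'chatflowName', 'chatflowType',
    'selectedSimpleEvaluators', 'selectedLLMEvaluators',
    'datasetAsOneConversation'
]

const missingEvaluationFields = (body) =>
    REQUIRED_FIELDS.filter((field) => body[field] === undefined)
```

For example, `missingEvaluationFields({ name: 'Test' })` returns the nine remaining required field names, which could be surfaced as a validation error before calling `createEvaluation`.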
## Usage Examples

### Creating a Benchmarking Evaluation

```javascript
import evaluationApi from '@/api/evaluations'

// Run a benchmarking evaluation with simple evaluators
const response = await evaluationApi.createEvaluation({
    name: 'Support Bot Quality Check v1',
    evaluationType: 'benchmarking',
    datasetId: 'dataset-abc-123',
    datasetName: 'Customer Support QA',
    chatflowId: JSON.stringify(['chatflow-001']),
    chatflowName: JSON.stringify(['Support Bot v2']),
    chatflowType: JSON.stringify(['chatflow']),
    selectedSimpleEvaluators: ['evaluator-text-001', 'evaluator-numeric-002'],
    selectedLLMEvaluators: [],
    datasetAsOneConversation: false
})

const evaluationRuns = response.data
```
### Creating an LLM-Graded Evaluation with Multiple Chatflows

```javascript
import evaluationApi from '@/api/evaluations'

// Compare two chatflows using LLM-based grading
const response = await evaluationApi.createEvaluation({
    name: 'A/B Test - GPT-4 vs Claude',
    evaluationType: 'llm',
    datasetId: 'dataset-xyz-456',
    datasetName: 'Product FAQ Tests',
    chatflowId: JSON.stringify(['chatflow-gpt4', 'chatflow-claude']),
    chatflowName: JSON.stringify(['GPT-4 Bot', 'Claude Bot']),
    chatflowType: JSON.stringify(['chatflow', 'chatflow']),
    selectedSimpleEvaluators: ['evaluator-latency-001'],
    selectedLLMEvaluators: ['evaluator-llm-relevance'],
    model: 'gpt-4',
    llm: 'chatOpenAI',
    credentialId: 'cred-openai-001',
    datasetAsOneConversation: false
})
```
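`createEvaluation` returns the Promise from an Axios call, so failures surface as rejected promises, and an Axios error carries the server response (when one was received) on `error.response`. A sketch of an error-handling wrapper under those assumptions (the `runEvaluation` helper is illustrative, not part of the Flowise source):

```javascript
// Illustrative wrapper, not part of the Flowise source: unwraps the
// evaluation runs on success and surfaces a readable message on failure.
const runEvaluation = async (api, body) => {
    try {
        const response = await api.createEvaluation(body)
        return response.data // array of evaluation run objects
    } catch (error) {
        // Axios places the server payload, when present, on error.response
        const message = error.response?.data?.message ?? error.message
        throw new Error(`Evaluation creation failed: ${message}`)
    }
}
```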