Principle:Promptfoo Promptfoo Evaluation Execution

From Leeroopedia
Knowledge Sources
Domains Evaluation, Testing
Last Updated 2026-02-14 08:00 GMT

Overview

An evaluation orchestration mechanism that executes all combinations of prompts, providers, and test cases with controlled concurrency, collecting graded results.

Description

Evaluation Execution is the core runtime phase of LLM testing. Given a fully resolved TestSuite, the evaluator generates the matrix of all prompt × provider × test combinations (multiplied by the repeat count when repeats are configured) and executes the cells in parallel, up to a configurable concurrency limit.
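
The matrix expansion can be illustrated with a minimal sketch. The `TestCase` and `Cell` shapes and the `buildMatrix` function are hypothetical names chosen for this example, not promptfoo's actual types, and the repeat dimension is left out for brevity.

```typescript
// Hypothetical, simplified shapes; a real TestSuite carries much more detail.
interface TestCase {
  vars: Record<string, string>;
}

interface Cell {
  prompt: string;
  providerId: string;
  test: TestCase;
}

// Expand every prompt × provider × test combination into a flat list of cells.
function buildMatrix(
  prompts: string[],
  providerIds: string[],
  tests: TestCase[],
): Cell[] {
  const cells: Cell[] = [];
  for (const prompt of prompts) {
    for (const providerId of providerIds) {
      for (const test of tests) {
        cells.push({ prompt, providerId, test });
      }
    }
  }
  return cells;
}

const matrix = buildMatrix(
  ["Summarize: {{text}}", "TL;DR: {{text}}"],
  ["provider-a", "provider-b"],
  [
    { vars: { text: "first" } },
    { vars: { text: "second" } },
    { vars: { text: "third" } },
  ],
);
console.log(matrix.length); // 2 prompts × 2 providers × 3 tests = 12
```

The cell count grows multiplicatively, which is why even a modest suite can produce thousands of provider calls.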

For each combination, the evaluator:

  • Renders the prompt template with test variables using Nunjucks
  • Calls the provider's API with the rendered prompt
  • Runs all assertions against the provider's response
  • Records latency, token usage, cost, and grading results

The evaluator also supports multi-turn conversations and abort signals for cancelling a run in progress.
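
The per-combination steps above can be sketched as a single function. Everything here is an assumption made for illustration: `renderTemplate` is a tiny stand-in that handles only `{{ name }}` placeholders (the real evaluator uses Nunjucks), the `Provider` interface is reduced to a single `callApi` method, assertions are plain predicates, and the score is taken to be the fraction of assertions that pass.

```typescript
// All names below are illustrative assumptions, not promptfoo's real types.
interface ProviderResponse {
  output: string;
  tokens: number;
  cost: number;
}

interface Provider {
  callApi(prompt: string): Promise<ProviderResponse>;
}

type Assertion = (output: string) => boolean;

interface CellResult {
  pass: boolean;
  score: number;
  latencyMs: number;
  tokens: number;
  cost: number;
  output: string;
}

// Stand-in for Nunjucks: replaces only simple {{ name }} placeholders.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, name: string) =>
    name in vars ? vars[name] : "",
  );
}

async function runCell(
  template: string,
  vars: Record<string, string>,
  provider: Provider,
  assertions: Assertion[],
): Promise<CellResult> {
  const prompt = renderTemplate(template, vars); // 1. render the prompt
  const started = Date.now();
  const response = await provider.callApi(prompt); // 2. call the provider
  const latencyMs = Date.now() - started;
  // 3. run every assertion against the response
  const passed = assertions.filter((check) => check(response.output)).length;
  const score = assertions.length === 0 ? 1 : passed / assertions.length;
  // 4. record the graded result together with usage metrics
  return {
    pass: passed === assertions.length,
    score,
    latencyMs,
    tokens: response.tokens,
    cost: response.cost,
    output: response.output,
  };
}

// Fake provider that echoes the prompt, so the sketch runs offline.
const echoProvider: Provider = {
  async callApi(prompt) {
    return { output: `echo: ${prompt}`, tokens: prompt.length, cost: 0 };
  },
};

runCell("Say hello to {{ name }}", { name: "Ada" }, echoProvider, [
  (output) => output.includes("Ada"),
]).then((result) => console.log(result.pass, result.output));
```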

This mechanism solves the challenge of efficiently running potentially thousands of test-provider-prompt combinations while managing rate limits, progress reporting, and error recovery.
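
One common way to bound concurrency while recovering from individual failures is a small worker pool. The sketch below is an assumption about how such a pool could look, not a description of promptfoo's scheduler; `runWithConcurrency`, `Settled`, and `demo` are hypothetical names.

```typescript
// Hypothetical helper: run tasks with at most `limit` in flight at once.
type Settled<T> = { ok: true; value: T } | { ok: false; error: unknown };

async function runWithConcurrency<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
  onProgress?: (done: number, total: number) => void,
): Promise<Settled<T>[]> {
  const results: Settled<T>[] = new Array(tasks.length);
  let next = 0;
  let done = 0;

  // Each worker repeatedly claims the next unclaimed index until none remain.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await tasks[index]() };
      } catch (error) {
        // Record the failure and keep going so one bad call does not end the run.
        results[index] = { ok: false, error };
      }
      done++;
      if (onProgress) onProgress(done, tasks.length);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Demo: five simulated provider calls, one of which fails, with a limit of 2.
async function demo(): Promise<{ peak: number; results: Settled<number>[] }> {
  const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));
  let inFlight = 0;
  let peak = 0;
  const tasks = [1, 2, 3, 4, 5].map((n) => async () => {
    inFlight++;
    peak = Math.max(peak, inFlight);
    await sleep(5);
    inFlight--;
    if (n === 3) throw new Error("simulated provider failure");
    return n * 10;
  });
  const results = await runWithConcurrency(tasks, 2);
  return { peak, results };
}

demo().then(({ peak, results }) =>
  console.log("peak in flight:", peak, "ok flags:", results.map((r) => r.ok)),
);
```

Results are stored by task index, so the output order matches the input order regardless of which call finishes first, and the optional `onProgress` callback gives a natural hook for progress reporting.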

Usage

Apply this principle when running the evaluation itself, once configuration loading, provider resolution, and test suite construction have completed. It is the fourth step in the evaluation pipeline and the most computationally intensive.

Theoretical Basis

Pseudo-code Logic:

1. Generate evaluation matrix: prompts × providers × tests × repeats
2. For each cell in matrix (with concurrency control):
   a. Render prompt template with test vars
   b. Call provider.callApi(renderedPrompt)
   c. Apply output transforms if configured
   d. Run assertions against response
   e. Record result: { pass/fail, score, latency, cost, tokens }
   f. Update progress bar and emit events
3. Aggregate results into Eval record
4. Return completed Eval with all results
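
The aggregation step can be sketched as a fold over the per-cell records. The `CellRecord` fields mirror the result record in the pseudo-code above, while `EvalSummary` and `summarize` are hypothetical names; the actual Eval record is assumed to hold considerably more than these totals.

```typescript
// Hypothetical record and summary shapes, for illustration only.
interface CellRecord {
  pass: boolean;
  score: number;
  latencyMs: number;
  cost: number;
  tokens: number;
}

interface EvalSummary {
  total: number;
  passed: number;
  failed: number;
  meanLatencyMs: number;
  totalCost: number;
  totalTokens: number;
}

// Fold the per-cell records into run-level totals.
function summarize(records: CellRecord[]): EvalSummary {
  const total = records.length;
  const passed = records.filter((r) => r.pass).length;
  const sum = (pick: (r: CellRecord) => number) =>
    records.reduce((acc, r) => acc + pick(r), 0);
  return {
    total,
    passed,
    failed: total - passed,
    meanLatencyMs: total === 0 ? 0 : sum((r) => r.latencyMs) / total,
    totalCost: sum((r) => r.cost),
    totalTokens: sum((r) => r.tokens),
  };
}

const summary = summarize([
  { pass: true, score: 1, latencyMs: 100, cost: 0.5, tokens: 40 },
  { pass: true, score: 1, latencyMs: 200, cost: 0.25, tokens: 60 },
  { pass: false, score: 0, latencyMs: 300, cost: 0.25, tokens: 50 },
]);
console.log(summary);
```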

Related Pages

Implemented By

Uses Heuristic
