
Implementation:Promptfoo Promptfoo Remote Scoring WithPi

From Leeroopedia
Knowledge Sources
Domains: Grading, Remote_API
Last Updated: 2026-02-14 07:45 GMT

Overview

Concrete tool for scoring LLM outputs using the WithPi API, a third-party evaluation service that grades responses against user-defined scoring specifications.

Description

The Remote_Scoring_WithPi module (remoteScoring.ts) integrates with the WithPi API (api.withpi.ai) to score LLM outputs against custom scoring specifications. Each specification contains a list of questions; every question is scored individually, and the result reports a total score together with a per-question breakdown. The module requires the WITHPI_API_KEY environment variable to be set and accepts an optional pass threshold (default 0.5), which is the minimum total score for the result to count as a pass.
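The two preconditions described above (an API key and a pass threshold with a default) can be sketched as follows. This is a minimal illustration, not the code in remoteScoring.ts: the helper names requireWithPiKey and resolveThreshold and the error message are hypothetical, and the environment is passed in as a plain object so the snippet stays self-contained.

```typescript
// Hypothetical helpers for illustration; these names do not come from remoteScoring.ts.
const DEFAULT_PASS_THRESHOLD = 0.5; // default stated in the description above

// Return the WithPi API key, or fail early with a clear message when it is missing.
function requireWithPiKey(env: Record<string, string | undefined>): string {
  const key = env['WITHPI_API_KEY'];
  if (!key) {
    throw new Error('WITHPI_API_KEY environment variable must be set');
  }
  return key;
}

// Use the caller's threshold when one is given, otherwise the documented default.
// `??` (not `||`) keeps an explicit threshold of 0 intact.
function resolveThreshold(passThreshold?: number): number {
  return passThreshold ?? DEFAULT_PASS_THRESHOLD;
}
```

In a Node.js process, such a guard would typically be called with process.env before any request is sent, so a missing key surfaces as a configuration error instead of a failed HTTP call.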

Usage

Used when the pi assertion type is specified in test configurations, enabling third-party evaluation of LLM responses.

Code Reference

Source Location

Signature

export async function doRemoteScoringWithPi(
  payload: RemotePiScoringPayload,
  passThreshold?: number,
): Promise<Omit<GradingResult, 'assertion'>>

Import

import { doRemoteScoringWithPi } from './remoteScoring';

I/O Contract

Inputs

Name | Type | Required | Description
payload | RemotePiScoringPayload | Yes | Object with llm_input, llm_output, and scoring_spec
passThreshold | number | No | Minimum total_score to pass (default: 0.5)
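The payload shape can be sketched from the three fields listed above and the usage example on this page. The interface names below are stand-ins (the real RemotePiScoringPayload type may carry additional fields), and the buildPiPayload helper and its empty-list check are assumptions added for illustration.

```typescript
// Assumed shape, inferred from the documented field names; not the module's actual type.
interface ScoringQuestionSketch {
  question: string;
}

interface RemotePiScoringPayloadSketch {
  llm_input: string;
  llm_output: string;
  scoring_spec: ScoringQuestionSketch[];
}

// Hypothetical helper: wrap plain question strings into a scoring_spec.
function buildPiPayload(
  llmInput: string,
  llmOutput: string,
  questions: string[],
): RemotePiScoringPayloadSketch {
  // Assumption: a spec with no questions has nothing to score, so reject it early.
  if (questions.length === 0) {
    throw new Error('scoring_spec needs at least one question');
  }
  return {
    llm_input: llmInput,
    llm_output: llmOutput,
    scoring_spec: questions.map((question) => ({ question })),
  };
}
```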

Outputs

Name | Type | Description
result | GradingResult (partial) | Object with pass, score, namedScores, reason
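How a scoring response might be turned into the four output fields can be sketched as below. The total_score field name appears on this page; question_scores, the type names, the toPiGradingResult helper, and the wording of reason are assumptions, and the comparison uses >= because the threshold is documented as a minimum.

```typescript
// Assumed response shape: total_score is named on this page, question_scores is a guess.
interface PiScoreResponseSketch {
  total_score: number;
  question_scores: Record<string, number>;
}

// Stand-in for the partial GradingResult described in the Outputs table.
interface PartialGradingResultSketch {
  pass: boolean;
  score: number;
  namedScores: Record<string, number>;
  reason: string;
}

// Hypothetical mapping from a scoring response to the grading result fields.
function toPiGradingResult(
  response: PiScoreResponseSketch,
  passThreshold: number = 0.5,
): PartialGradingResultSketch {
  // The threshold is a minimum, so a score equal to it still passes.
  const pass = response.total_score >= passThreshold;
  return {
    pass,
    score: response.total_score,
    namedScores: { ...response.question_scores }, // copy the per-question breakdown
    reason: pass
      ? `Pi score ${response.total_score} met threshold ${passThreshold}`
      : `Pi score ${response.total_score} is below threshold ${passThreshold}`,
  };
}
```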

Usage Examples

import { doRemoteScoringWithPi } from './remoteScoring';

const result = await doRemoteScoringWithPi({
  llm_input: 'What is machine learning?',
  llm_output: 'Machine learning is a subset of AI...',
  scoring_spec: [
    { question: 'Is the response accurate?' },
    { question: 'Is the response complete?' },
  ],
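}, 0.7); // pass only if the total score is at least 0.7

// Illustrative values only; actual scores depend on the WithPi response.
console.log(result.score);       // e.g. 0.85
console.log(result.namedScores); // e.g. { 'Is the response accurate?': 0.9, ... }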

Related Pages
