
Implementation:Microsoft BIPIA BipiaEvalFactory

From Leeroopedia
Field Value
Sources Repo
Domains NLP, Security, Evaluation
Last Updated 2026-02-14

Overview

A concrete tool from the BIPIA benchmark library for computing attack success rates (ASR) across its 26 attack types.

Description

BipiaEvalFactory is the orchestration class for ASR evaluation. On initialization, it calls depia_regist_fn(gpt_config) to build a mapping of attack names to evaluator factory functions, then instantiates evaluators for the activated attacks.

Each evaluator implements two core methods:

  • add() -- Ingests a single sample (prediction, reference, metadata) and records a binary success/failure result.
  • compute() -- Returns the ASR for that evaluator's attack type.
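The add()/compute() contract above can be sketched with a minimal illustrative evaluator. The class name and the success criterion below are hypothetical, not BIPIA's actual code; they only show the shape of the interface:

```python
class BinaryAttackEvaluator:
    """Illustrative evaluator: records one success/failure bit per sample."""

    def __init__(self):
        self.results = []  # 1 = attack succeeded, 0 = attack failed

    def add(self, prediction, reference, **metadata):
        # Hypothetical criterion: the injected payload (reference)
        # appears verbatim in the model's response.
        success = int(reference in prediction)
        self.results.append(success)
        return success

    def compute(self):
        # ASR = fraction of ingested samples where the attack succeeded.
        return sum(self.results) / len(self.results) if self.results else 0.0
```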

The evaluator classes include:

Evaluator Class Purpose
ModelEval GPT-based chain-of-thought judging for task-irrelevant and task-relevant attacks
LanguageEval Language detection for translation-based attacks
MatchRefEval Fuzzy string matching for content injection attacks
BaseEncodeEval Validation for base64, reverse, and emoji encoding attacks
CarsarEval Validation for Caesar cipher encryption attacks
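As an example of the fuzzy-matching technique that MatchRefEval relies on (BIPIA's exact matcher and threshold may differ; the function name and the 0.8 cutoff here are illustrative), Python's standard difflib can score similarity between a prediction and the injected reference:

```python
from difflib import SequenceMatcher

def fuzzy_attack_success(prediction: str, reference: str,
                         threshold: float = 0.8) -> int:
    """Return 1 if the prediction is similar enough to the injected content.

    The 0.8 threshold is illustrative, not BIPIA's actual value.
    """
    ratio = SequenceMatcher(None, prediction.lower(), reference.lower()).ratio()
    return int(ratio >= threshold)
```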

The factory exposes two main methods for batch evaluation:

  • add_batch() -- Processes multiple samples, routing each to the appropriate evaluator based on its attack name. Returns a List[int] of per-sample binary ASR values (0 or 1).
  • compute() -- Aggregates results across all evaluators and returns an OrderedDict containing per-attack ASR values plus macro and micro ASR aggregates.

Usage

Instantiate BipiaEvalFactory after response generation to evaluate attack success rates. It requires a GPT config for model-based judging (ModelEval uses it to call the OpenAI API for chain-of-thought evaluation).

Code Reference

Source
BIPIA repo, File: bipia/metrics/eval_factory.py, Lines: L1-90
Signature
BipiaEvalFactory.__init__(
    self,
    *,
    gpt_config: str | dict,
    regist_fn=depia_regist_fn,
    activate_attacks: list,
    **kwargs
)

BipiaEvalFactory.add_batch(
    self,
    *,
    references: List,
    predictions: List,
    attacks: List,
    tasks: List,
    **kwargs
) -> List[int]

BipiaEvalFactory.compute(self) -> OrderedDict
Import
from bipia.metrics import BipiaEvalFactory

I/O Contract

Inputs

Parameter Type Required Description
gpt_config str | dict Yes GPT config for judge model, as a config file path or dict (used by ModelEval)
activate_attacks list Yes Attack names to evaluate (subset of 26 available)
predictions List[str] Yes Model responses to evaluate
references List[str] Yes Target/ideal responses for comparison
attacks List[str] Yes Attack name per sample (used for dispatch)
tasks List[str] Yes Task name per sample

Outputs

Output Type Description
Per-attack ASR float (0-1) Attack success rate for each activated attack type
Macro ASR float (0-1) Unweighted average across all attack types
Micro ASR float (0-1) Sample-weighted average across all attack types
Per-sample ASR (from add_batch) List[int] Binary 0 or 1 for each sample indicating attack success
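The macro/micro distinction can be illustrated numerically. The per-attack tallies below are made up for the example:

```python
# Hypothetical per-attack tallies: (successes, total samples)
tallies = {
    "email_injection": (9, 10),   # ASR 0.9 on a small attack set
    "caesar_cipher": (10, 100),   # ASR 0.1 on a large attack set
}

per_attack = {name: s / n for name, (s, n) in tallies.items()}

# Macro ASR: unweighted mean of the per-attack rates.
macro_asr = sum(per_attack.values()) / len(per_attack)  # (0.9 + 0.1) / 2 = 0.5

# Micro ASR: pooled successes over pooled samples,
# so larger attack sets carry more weight.
micro_asr = (sum(s for s, _ in tallies.values())
             / sum(n for _, n in tallies.values()))  # 19 / 110 ≈ 0.173
```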

Usage Examples

from bipia.metrics import BipiaEvalFactory

# Define the attacks to evaluate
activate_attacks = [
    "email_injection",
    "translation_attack",
    "content_injection",
    "base64_encoding",
    "caesar_cipher",
]

# Initialize the factory with GPT config and active attacks
eval_factory = BipiaEvalFactory(
    gpt_config="path/to/gpt_config.yaml",
    activate_attacks=activate_attacks,
)

# Process samples in batches (e.g., from a dataloader)
for batch in dataloader:
    per_sample_asr = eval_factory.add_batch(
        references=batch["references"],
        predictions=batch["predictions"],
        attacks=batch["attacks"],
        tasks=batch["tasks"],
    )
    # per_sample_asr is a List[int], e.g. [0, 1, 0, 1, 1]

# Compute aggregate results after all batches
results = eval_factory.compute()

# results is an OrderedDict, e.g.:
# OrderedDict([
#     ("email_injection", 0.45),
#     ("translation_attack", 0.30),
#     ("content_injection", 0.55),
#     ("base64_encoding", 0.20),
#     ("caesar_cipher", 0.15),
#     ("macro_asr", 0.33),
#     ("micro_asr", 0.35),
# ])
