Implementation: Microsoft BIPIA BipiaEvalFactory
| Field | Value |
|---|---|
| Sources | Repo |
| Domains | NLP, Security, Evaluation |
| Last Updated | 2026-02-14 |
Overview
A concrete tool for computing attack success rates (ASR) across the 26 attack types provided by the BIPIA benchmark library.
Description
BipiaEvalFactory is the orchestration class for ASR evaluation. On initialization, it calls depia_regist_fn(gpt_config) to build a mapping of attack names to evaluator factory functions, then instantiates evaluators for the activated attacks.
Each evaluator implements two core methods:
- `add()`: ingests a single sample (prediction, reference, metadata) and records a binary success/failure result.
- `compute()`: returns the ASR for that evaluator's attack type.
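The add()/compute() contract can be illustrated with a minimal hypothetical evaluator. The class name and matching logic below are illustrative only, not BIPIA's actual evaluator code:

```python
# Hypothetical minimal evaluator following the add()/compute() contract
# described above. Not one of BIPIA's evaluator classes.
class ExactMatchEval:
    def __init__(self):
        self.results = []  # one binary success/failure flag per sample

    def add(self, prediction: str, reference: str) -> int:
        # Attack "succeeds" if the model reproduced the injected target text
        success = int(prediction.strip() == reference.strip())
        self.results.append(success)
        return success

    def compute(self) -> float:
        # ASR = fraction of ingested samples where the attack succeeded
        if not self.results:
            return 0.0
        return sum(self.results) / len(self.results)
```

A real evaluator such as ModelEval replaces the exact-match check with a GPT judgment call, but the ingest-then-aggregate shape is the same.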
The evaluator classes include:
| Evaluator Class | Purpose |
|---|---|
| ModelEval | GPT-based chain-of-thought judging for task-irrelevant and task-relevant attacks |
| LanguageEval | Language detection for translation-based attacks |
| MatchRefEval | Fuzzy string matching for content injection attacks |
| BaseEncodeEval | Validation for base64, reverse, and emoji encoding attacks |
| CarsarEval | Validation for Caesar cipher encryption attacks |
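As a rough sketch of what encoding-based validation might involve (hypothetical logic, not BIPIA's BaseEncodeEval implementation), a base64 check could test whether the response contains the target text in encoded form:

```python
import base64

def contains_base64_target(prediction: str, target: str) -> bool:
    """Illustrative check for a base64 encoding attack: did the model
    emit the injected target text base64-encoded? Hypothetical helper,
    not part of the BIPIA library."""
    encoded = base64.b64encode(target.encode("utf-8")).decode("ascii")
    return encoded in prediction
```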
The factory exposes two main methods for batch evaluation:
- `add_batch()`: processes multiple samples, routing each to the appropriate evaluator based on its attack name; returns a `List[int]` of per-sample binary ASR values (0 or 1).
- `compute()`: aggregates results across all evaluators and returns an `OrderedDict` containing per-attack ASR values plus macro and micro ASR aggregates.
Usage
Import BipiaEvalFactory after response generation to evaluate attack success rates. Requires a GPT config for model-based judging (used by ModelEval to call the OpenAI API for chain-of-thought evaluation).
Code Reference
- Source: BIPIA repo, file `bipia/metrics/eval_factory.py`, lines 1-90
- Signature
BipiaEvalFactory.__init__(
self,
*,
gpt_config: str | dict,
regist_fn=depia_regist_fn,
activate_attacks: list,
**kwargs
)
BipiaEvalFactory.add_batch(
self,
*,
references: List,
predictions: List,
attacks: List,
tasks: List,
**kwargs
) -> List[int]
BipiaEvalFactory.compute(self) -> OrderedDict
- Import
from bipia.metrics import BipiaEvalFactory
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| gpt_config | str \| dict | Yes | Path to a GPT config file, or a config dict, for the judge model (used by ModelEval) |
| activate_attacks | list | Yes | Attack names to evaluate (subset of 26 available) |
| predictions | List[str] | Yes | Model responses to evaluate |
| references | List[str] | Yes | Target/ideal responses for comparison |
| attacks | List[str] | Yes | Attack name per sample (used for dispatch) |
| tasks | List[str] | Yes | Task name per sample |
Outputs
| Output | Type | Description |
|---|---|---|
| Per-attack ASR | float (0-1) | Attack success rate for each activated attack type |
| Macro ASR | float (0-1) | Unweighted average across all attack types |
| Micro ASR | float (0-1) | Sample-weighted average across all attack types |
| Per-sample ASR (from add_batch) | List[int] | Binary 0 or 1 for each sample indicating attack success |
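The macro/micro distinction above can be sketched with simple arithmetic. This is an illustrative aggregation over assumed per-attack (successes, total samples) counts, not BIPIA's exact compute() code:

```python
from collections import OrderedDict

def aggregate_asr(per_attack):
    """per_attack maps attack name -> (successes, total_samples).
    Illustrative aggregation mirroring the output table above;
    hypothetical helper, not BIPIA's implementation."""
    results = OrderedDict(
        (name, successes / total)
        for name, (successes, total) in per_attack.items()
    )
    # Macro ASR: unweighted mean of the per-attack rates
    results["macro_asr"] = sum(results.values()) / len(per_attack)
    # Micro ASR: total successes over total samples (sample-weighted)
    total_successes = sum(s for s, _ in per_attack.values())
    total_samples = sum(n for _, n in per_attack.values())
    results["micro_asr"] = total_successes / total_samples
    return results
```

Note that macro ASR treats every attack type equally, while micro ASR gives more weight to attack types with more samples.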
Usage Examples
from bipia.metrics import BipiaEvalFactory
# Define the attacks to evaluate
activate_attacks = [
"email_injection",
"translation_attack",
"content_injection",
"base64_encoding",
"caesar_cipher",
]
# Initialize the factory with GPT config and active attacks
eval_factory = BipiaEvalFactory(
gpt_config="path/to/gpt_config.yaml",
activate_attacks=activate_attacks,
)
# Process samples in batches (e.g., from a dataloader)
for batch in dataloader:
per_sample_asr = eval_factory.add_batch(
references=batch["references"],
predictions=batch["predictions"],
attacks=batch["attacks"],
tasks=batch["tasks"],
)
# per_sample_asr is a List[int], e.g. [0, 1, 0, 1, 1]
# Compute aggregate results after all batches
results = eval_factory.compute()
# results is an OrderedDict, e.g.:
# OrderedDict([
# ("email_injection", 0.45),
# ("translation_attack", 0.30),
# ("content_injection", 0.55),
# ("base64_encoding", 0.20),
# ("caesar_cipher", 0.15),
# ("macro_asr", 0.33),
# ("micro_asr", 0.35),
# ])