Implementation:Lakeraai Pint benchmark Pint Benchmark Function
| Knowledge Sources | |
|---|---|
| Domains | Model_Evaluation, Benchmarking, Prompt_Injection |
| Last Updated | 2026-02-14 14:00 GMT |
Overview
Concrete tool for running the PINT Benchmark evaluation loop, computing per-category accuracy, and producing balanced or imbalanced scores.
Description
The pint_benchmark function is the main entry point for benchmark execution. Defined in the Jupyter notebook benchmark/pint-benchmark.ipynb, it:
- Delegates row-by-row evaluation to the evaluate_dataset helper function
- Computes a balanced or imbalanced accuracy score from the grouped results
- Optionally prints a formatted results table to stdout
- Returns a tuple of (model_name, score, benchmark_dataframe)
The function accepts any callable conforming to the eval_function(prompt: str) -> bool interface, making it agnostic to the underlying detection system.
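A detector that emits a confidence score rather than a boolean can be adapted to this interface with a thin wrapper. The sketch below is illustrative only: `as_bool_eval`, `toy_score`, and the 0.5 threshold are hypothetical and not part of the notebook.

```python
from typing import Callable


def as_bool_eval(
    score_fn: Callable[[str], float], threshold: float = 0.5
) -> Callable[[str], bool]:
    """Wrap a score-returning detector so it matches eval_function(prompt) -> bool."""

    def eval_function(prompt: str) -> bool:
        # True means "flagged" under the assumed threshold.
        return score_fn(prompt) >= threshold

    return eval_function


def toy_score(prompt: str) -> float:
    """Hypothetical stand-in detector: a crude keyword check, for illustration only."""
    return 0.9 if "ignore previous instructions" in prompt.lower() else 0.1


toy_eval = as_bool_eval(toy_score, threshold=0.5)
```

The resulting `toy_eval` could then be passed as `eval_function=toy_eval`, in the same way as any other callback.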
Usage
Call this function from the PINT Benchmark notebook after loading the dataset and preparing an evaluation function. It is used identically across all three workflows (HF model, custom system, custom dataset) with only the eval_function and df parameters changing.
Code Reference
Source Location
- Repository: pint-benchmark
- File: benchmark/pint-benchmark.ipynb (cell-17)
- Helper: benchmark/pint-benchmark.ipynb (cell-15, evaluate_dataset function)
Signature
def pint_benchmark(
    df: pd.DataFrame,
    model_name: str,
    eval_function: Callable[[str], float] = evaluate_lakera_guard,
    quiet: bool = False,
    weight: Literal["balanced", "imbalanced"] = "balanced",
) -> tuple[str, float, pd.DataFrame]:
    """
    Evaluate a model on a dataset and print the benchmark results.

    Args:
        df: DataFrame with the dataset. Should contain columns "text" and "label".
        model_name: Name of the model being evaluated, for display purposes.
        eval_function: Function that takes a prompt and returns a boolean prediction.
        quiet: If True, suppresses printing of benchmark results.
        weight: If "imbalanced", score = correct / total.
            If "balanced", score = mean of per-label accuracies.

    Returns:
        Tuple of (model_name, score, benchmark_results_dataframe).
    """

def evaluate_dataset(
    df: pd.DataFrame,
    eval_function: Callable,
) -> pd.DataFrame:
    """
    Iterate through the dataframe and call the evaluation function on each input.

    Returns:
        A new dataframe that contains accuracy metrics for each category and label.
    """
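The two `weight` options differ only in how per-row correctness is aggregated. The following is a minimal sketch of that arithmetic in plain Python, based solely on the docstring above. It is not the notebook's code, and `score_predictions` and the sample rows are hypothetical.

```python
from collections import defaultdict


def score_predictions(rows: list[tuple[bool, bool]], weight: str = "balanced") -> float:
    """rows holds (label, prediction) pairs; returns a score between 0 and 1."""
    correct_by_label: dict[bool, int] = defaultdict(int)
    total_by_label: dict[bool, int] = defaultdict(int)
    for label, prediction in rows:
        total_by_label[label] += 1
        correct_by_label[label] += int(prediction == label)

    if weight == "imbalanced":
        # correct / total over all rows, regardless of label
        return sum(correct_by_label.values()) / sum(total_by_label.values())

    # "balanced": unweighted mean of the per-label accuracies
    per_label = [
        correct_by_label[label] / total_by_label[label] for label in total_by_label
    ]
    return sum(per_label) / len(per_label)


# Made-up sample: 8 benign rows (7 correct) and 2 injection rows (1 correct)
sample = [(False, False)] * 7 + [(False, True)] + [(True, True), (True, False)]
imbalanced = score_predictions(sample, "imbalanced")  # 8 / 10 = 0.8
balanced = score_predictions(sample, "balanced")      # (7/8 + 1/2) / 2 = 0.6875
```

On a skewed dataset like this one, the balanced score keeps the majority label from dominating the result.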
Import
# Defined in notebook cell-17; available after running all preceding cells
# No standalone import — run notebook cells in order
# evaluate_dataset is defined in cell-15
# pint_benchmark is defined in cell-17
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| df | pd.DataFrame | Yes | Dataset with columns: "text" (str), "category" (str), "label" (bool) |
| model_name | str | Yes | Display name for the model in results output |
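| eval_function | Callable[[str], float] | No | Evaluation callback; defaults to evaluate_lakera_guard. The signature annotates the return type as float, while the docstring and usage examples describe a boolean prediction |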
| quiet | bool | No | Suppress stdout printing; defaults to False |
| weight | Literal["balanced", "imbalanced"] | No | Scoring method; defaults to "balanced" |
Outputs
| Name | Type | Description |
|---|---|---|
| model_name | str | The display name passed in |
| score | float | Balanced (or imbalanced) accuracy as a decimal (e.g. 0.9522 = 95.22%) |
| benchmark | pd.DataFrame | Per-category/label accuracy with columns: accuracy, correct, total. MultiIndex on (category, label). |
| stdout (side effect) | str | When quiet=False, prints formatted table with model name, score, per-category breakdown, and evaluation date |
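To make the shape of the benchmark output concrete, the sketch below builds the same kind of per-(category, label) summary in plain Python, with a dictionary keyed by (category, label) standing in for the MultiIndex. The `summarize` helper, the `prediction` field, and the sample rows are illustrative assumptions, not the notebook's evaluate_dataset implementation.

```python
from collections import defaultdict


def summarize(rows: list[dict]) -> dict[tuple[str, bool], dict[str, float]]:
    """Group rows by (category, label) and report accuracy, correct, total."""
    counts: dict[tuple[str, bool], list[int]] = defaultdict(lambda: [0, 0])
    for row in rows:
        key = (row["category"], row["label"])
        counts[key][0] += int(row["prediction"] == row["label"])  # correct
        counts[key][1] += 1  # total
    return {
        key: {"accuracy": correct / total, "correct": correct, "total": total}
        for key, (correct, total) in counts.items()
    }


# Illustrative rows only; "prediction" is what an eval_function would return
rows = [
    {"category": "chat", "label": False, "prediction": False},
    {"category": "chat", "label": False, "prediction": True},
    {"category": "prompt_injection", "label": True, "prediction": True},
]
summary = summarize(rows)
```

Each entry of `summary` corresponds to one row of the benchmark table: one (category, label) group with its accuracy, correct, and total values.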
Usage Examples
Hugging Face Model Evaluation
from benchmark.utils.evaluate_hugging_face_model import HuggingFaceModelEvaluation

model = HuggingFaceModelEvaluation(
    model_name="protectai/deberta-v3-base-prompt-injection-v2",
    injection_label="INJECTION",
)

model_name, score, results_df = pint_benchmark(
    df=df,
    eval_function=model.evaluate,
    model_name=model.model_name,
    weight="balanced",
)
print(f"Score: {round(score * 100, 2)}%")
Custom API Evaluation
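import requests

def evaluate_my_api(prompt: str) -> bool:
    # Placeholder endpoint and response field; adapt both to the API under test
    response = requests.post(
        "https://my-api.example.com/detect", json={"text": prompt}, timeout=30
    )
    response.raise_for_status()  # fail loudly instead of scoring an error response
    return bool(response.json()["is_injection"])

model_name, score, results_df = pint_benchmark(
    df=df,
    eval_function=evaluate_my_api,
    model_name="My Custom API",
    weight="balanced",
)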
Quiet Mode for Programmatic Use
# Suppress stdout, just get the return values
_, score, results = pint_benchmark(
    df=df,
    eval_function=model.evaluate,
    model_name="Test Model",
    quiet=True,
)

# Compare scores programmatically
if score > 0.90:
    print("Model passes threshold")
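When several systems are benchmarked in quiet mode, the returned names and scores can be collected and compared in one pass. The sketch below assumes the (model_name, score) pairs have already been gathered. `passing_models` is a hypothetical helper, and the names and scores are made-up placeholders rather than benchmark results.

```python
def passing_models(
    results: list[tuple[str, float]], threshold: float = 0.90
) -> list[str]:
    """Return the names of models scoring above the threshold, best first."""
    ranked = sorted(results, key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score > threshold]


# Placeholder (model_name, score) pairs, as might be collected from
# repeated pint_benchmark(..., quiet=True) calls
collected = [("model-a", 0.87), ("model-b", 0.95), ("model-c", 0.91)]
print(passing_models(collected))  # prints ['model-b', 'model-c']
```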