Principle:Lakeraai Pint benchmark HF Model Wrapping

Knowledge Sources	HuggingFace Pipelines SetFit Documentation PINT Benchmark
Domains	NLP, Model_Evaluation, Prompt_Injection
Last Updated	2026-02-14 14:00 GMT

Overview

A design pattern that wraps heterogeneous Hugging Face model architectures behind a unified evaluation interface for prompt injection detection.

Description

When benchmarking prompt injection detection, models from Hugging Face Hub come in diverse architectures (standard sequence classification, SetFit few-shot models) with different tokenization requirements, context length limits, and output formats. HF Model Wrapping solves this fragmentation by encapsulating the model loading, tokenizer initialization, and classification pipeline setup into a single constructor. This allows downstream benchmark code to call a uniform evaluate(prompt) -> bool method regardless of the underlying model type.

The pattern handles three key challenges:

Architecture branching: Standard HuggingFace text-classification pipelines vs. SetFit predictors require different loading and inference paths.
Context length management: Models have varying maximum token lengths. The wrapper auto-detects max_position_embeddings from model config or falls back to a default of 512 tokens.
Long input chunking: Prompts exceeding the model's context window are split into overlapping chunks (25% stride) to ensure prompt injections near chunk boundaries are not missed.

Usage

Use this pattern when you need to evaluate any Hugging Face text-classification or SetFit model against a benchmark dataset. It is the entry point for the Hugging Face Model Evaluation workflow in the PINT Benchmark and should be instantiated once per model before passing its evaluate method to the benchmark runner.

Theoretical Basis

The wrapping pattern follows the Adapter design pattern from object-oriented design:

# Abstract algorithm (NOT real implementation)
wrapper = ModelWrapper(model_id, config_params)
# Internally:
#   1. Load model (HF pipeline OR SetFit)
#   2. Load tokenizer
#   3. Determine max_length from config or default
#   4. Create classifier (pipeline or predict method)

result = wrapper.evaluate(prompt)
# Internally:
#   1. Tokenize with chunking + stride overlap
#   2. Classify each chunk
#   3. Return True if ANY chunk flagged as injection

The chunking strategy uses a 25% overlap stride to prevent false negatives at chunk boundaries:

Failed to parse (syntax error): {\displaystyle \text{stride} = \lfloor \frac{\text{max\_length}}{4} \rfloor }

This ensures that a prompt injection payload spanning two chunks will be fully contained within at least one chunk.

Related Pages

Implemented By

Implementation:Lakeraai_Pint_benchmark_HuggingFaceModelEvaluation_Init

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment