Implementation:Microsoft BIPIA FewShotChatGPT35Defense
| Field | Value |
|---|---|
| Sources | Microsoft BIPIA Repository |
| Domains | NLP, Security, Defense |
| Last Updated | 2026-02-14 |
Overview
Concrete tool for applying border string and few-shot in-context learning defenses to GPT-3.5 API models provided by the BIPIA defense module.
Description
FewShotChatGPT35Defense extends the GPT35WOSystem base class to add defense mechanisms against indirect prompt injection attacks. The class is responsible for:
- Maintaining train/test dataset splits -- It accepts a
DatasetDictcontaining both training and testing splits. The training split serves as the pool from which few-shot defense examples are drawn. - Randomly selecting few-shot examples -- At initialization, the class samples
num_examplesentries from the training set (optionally seeded for reproducibility) to serve as in-context demonstrations. - Wrapping external content with configurable border strings -- The
add_border()method inserts border delimiters around untrusted external content within the prompt, using the configured border type (empty, equals, dashes, or code fences). - Constructing few-shot example messages -- The
construct_example()method builds the sequence of example messages that demonstrate correct model behavior (ignoring injected attacks) and caches them for reuse across test prompts. - Overriding
process_fn()-- The overridden method prepends the constructed few-shot defense context to each test prompt before inference, ensuring every query sent to the API includes the defense demonstrations and bordered content.
The class inherits the GPT model's generate() method for actual inference, so the defense is purely a prompt-construction layer that does not modify the underlying API call mechanics.
Usage
Instantiate FewShotChatGPT35Defense with a GPT config path, an accelerator, a DatasetDict containing train/test splits, the desired number of few-shot examples, and a border type. Then call construct_example() to build the few-shot context from the training set. Finally, use process_fn() to augment individual test examples with the defense context and generate() to run defended inference.
Code Reference
| Property | Value |
|---|---|
| Source | BIPIA repository |
| File | defense/black_box/few_shot.py
|
| Lines | L160-280 |
Signatures
FewShotChatGPT35Defense.__init__(
self,
config,
accelerator,
dataset,
num_examples: int = 1,
seed: int = None,
border_type: str = ""
)
FewShotChatGPT35Defense.add_border(
self,
user_prompt: str,
context: str
) -> str
FewShotChatGPT35Defense.construct_example(
self,
prompt_construct_fn: Callable,
response_construct_fn: Callable
) -> None
FewShotChatGPT35Defense.process_fn(
self,
example: dict,
prompt_construct_fn: Callable
) -> dict
Import
from defense.black_box.few_shot import FewShotChatGPT35Defense
Alternatively, the class is instantiated within the defense script and does not need to be imported directly by end users.
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
config |
str |
Yes | Path to a GPT configuration YAML file. |
accelerator |
Accelerator |
Yes | HuggingFace Accelerate instance for distributed setup. |
dataset |
DatasetDict |
Yes | Dataset with train and test splits.
|
num_examples |
int |
No (default: 1) | Number of few-shot examples to sample from the training set. |
seed |
int |
No | Random seed for reproducible example selection. |
border_type |
str |
No (default: "empty") |
Border delimiter type. One of "empty", "=", "-", or "code".
|
Outputs
Initialization produces a fully configured defense wrapper. Calling process_fn() returns a modified example dict whose message field has been augmented with few-shot defense context and bordered external content, ready for submission to the GPT-3.5 API via generate().
Usage Examples
from datasets import DatasetDict, Dataset
from defense.black_box.few_shot import FewShotChatGPT35Defense
# Assume train_data and test_data are lists of dicts with prompt/context/response keys
dataset = DatasetDict({
"train": Dataset.from_list(train_data),
"test": Dataset.from_list(test_data),
})
# Initialize the defense with 3 few-shot examples and equals-sign borders
defense = FewShotChatGPT35Defense(
config="configs/gpt35.yaml",
accelerator=accelerator,
dataset=dataset,
num_examples=3,
seed=42,
border_type="="
)
# Build the few-shot defense context from the training split
defense.construct_example(
prompt_construct_fn=my_prompt_builder,
response_construct_fn=my_response_builder,
)
# Apply the defense to every example in the test split
defended_test = dataset["test"].map(
lambda example: defense.process_fn(example, prompt_construct_fn=my_prompt_builder)
)
# Run defended inference
results = defense.generate(defended_test)