Implementation:Microsoft BIPIA Load Bipia Supervised Data Module
Overview
Concrete tool for constructing supervised finetuning datasets from poisoned prompts and correct responses provided by the BIPIA defense module.
Description
load_bipia_supervised_data_module() iterates over the configured task names ("all" or specific tasks joined by "+"), builds PIA datasets using AutoPIABuilder, determines response targets based on response_strategy, constructs conversation tuples (user_prompt, response), and applies tokenization with label masking. It supports concatenating datasets from multiple task types and creates labels in which IGNORE_TOKEN_ID masks all tokens before the response start.
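The tokenize-and-mask step can be sketched as follows. This is a minimal illustration, not the repo's exact code: it assumes a tokenizer with the standard Hugging Face `__call__` interface, and uses -100 for IGNORE_TOKEN_ID (the default ignore index for cross-entropy loss in PyTorch/Transformers).

```python
# Illustrative sketch of prompt-masked supervised tokenization;
# the actual finetune.py logic differs in detail.
IGNORE_TOKEN_ID = -100  # default ignore index for cross-entropy loss


def tokenize_conversation(tokenizer, user_prompt, response, max_len=2048):
    """Tokenize a (user_prompt, response) tuple into one training example."""
    prompt_ids = tokenizer(user_prompt, add_special_tokens=False)["input_ids"]
    response_ids = tokenizer(response, add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + response_ids)[:max_len]
    # Mask every token before the response start, so the loss is
    # computed only on the response tokens.
    labels = ([IGNORE_TOKEN_ID] * len(prompt_ids) + response_ids)[:max_len]
    attention_mask = [1] * len(input_ids)

    return {"input_ids": input_ids,
            "attention_mask": attention_mask,
            "labels": labels}
```

The key invariant is that `labels` and `input_ids` have identical length, with the prompt positions set to IGNORE_TOKEN_ID.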
Usage
Called internally during white-box defense finetuning with tokenizer and data_args.
Code Reference
| Field | Value |
|---|---|
| Source | BIPIA repo |
| File | defense/white_box/finetune.py |
| Lines | L261-474 |
| Signature | def load_bipia_supervised_data_module(tokenizer: transformers.PreTrainedTokenizer, data_args) -> Dict |
| Import | Internal function in finetune.py |
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| tokenizer | transformers.PreTrainedTokenizer | Yes | Tokenizer for the target model |
| data_args | DataArguments | Yes | Configuration dataclass containing all data parameters |
DataArguments fields:
| Field | Description |
|---|---|
| dataset_name | Task type(s): "all" or specific tasks joined by "+" (e.g., "qa+email+code") |
| response_strategy | One of "original", "self_clean", "gpt4_clean" |
| context_data_file | Path to context data |
| attack_data_file | Path to attack data |
| response_data_file | Path to clean response data (for self_clean/gpt4_clean strategies) |
| bipia_seed | Random seed for dataset construction |
| add_ign_guidance | Whether to prepend ignore-attack guidance text to prompts |
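How dataset_name and response_strategy might be interpreted can be sketched as below. This is an assumption-laden illustration, not the repo's code: the full task list and the sample field names (`ideal`, `id`) are hypothetical, though "qa", "email", and "code" appear in the example above.

```python
import json

# Assumed full BIPIA task list; only "qa", "email", "code" are
# confirmed by the dataset_name example in this document.
ALL_TASKS = ["qa", "email", "code", "table", "abstract"]


def resolve_tasks(dataset_name):
    """Expand data_args.dataset_name into concrete task names."""
    return list(ALL_TASKS) if dataset_name == "all" else dataset_name.split("+")


def select_response(strategy, sample, response_data_file=None):
    """Pick the supervision target for one sample (illustrative sketch)."""
    if strategy == "original":
        return sample["ideal"]  # response shipped with the task data
    if strategy in ("self_clean", "gpt4_clean"):
        # Clean responses generated beforehand (by the model itself
        # or by GPT-4), keyed by sample id in response_data_file.
        with open(response_data_file) as f:
            return json.load(f)[sample["id"]]
    raise ValueError(f"unknown response_strategy: {strategy!r}")
```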
Outputs
A Dict wrapping a datasets.Dataset (accessed as "train_dataset" in the usage example below) with the following columns:
| Column | Description |
|---|---|
| input_ids | Tokenized input sequence (prompt + response concatenated) |
| attention_mask | Standard attention mask (1 for real tokens, 0 for padding) |
| labels | Token IDs for loss computation, with IGNORE_TOKEN_ID (-100) masking all tokens before the response start |
Usage Examples
Function call during finetuning:
```python
from defense.white_box.finetune import load_bipia_supervised_data_module

data_module = load_bipia_supervised_data_module(
    tokenizer=tokenizer,
    data_args=data_args,
)

# data_module is a dict containing the Dataset,
# used directly with the HuggingFace Trainer:
trainer = Trainer(
    model=model,
    train_dataset=data_module["train_dataset"],
    ...
)
```
DataArguments configuration example:
```python
from dataclasses import dataclass

@dataclass
class DataArguments:
    dataset_name: str = "all"              # all 5 task types
    response_strategy: str = "self_clean"  # use model's own clean responses
    context_data_file: str = "data/context/"
    attack_data_file: str = "data/attacks/"
    response_data_file: str = "output/clean_responses/"
    bipia_seed: int = 42
    add_ign_guidance: bool = False
```