
Implementation:Microsoft BIPIA Load Bipia Supervised Data Module


Overview

Concrete tool for constructing supervised finetuning datasets from poisoned prompts and correct responses provided by the BIPIA defense module.

Description

load_bipia_supervised_data_module() iterates over the configured task names ("all" or specific tasks joined by "+"), builds PIA datasets with AutoPIABuilder, selects response targets according to response_strategy, constructs (user_prompt, response) conversation tuples, and tokenizes them. Datasets from multiple task types can be concatenated, and the labels column is masked with IGNORE_TOKEN_ID for every token before the response start, so only response tokens contribute to the loss.
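
To make the tokenization step concrete, here is a minimal sketch of how one (user_prompt, response) tuple becomes a masked training example. This illustrates the masking scheme only and is not the repository's exact code (which formats prompts with the model's conversation template); IGNORE_TOKEN_ID = -100 follows the HuggingFace convention used in the output contract below:

IGNORE_TOKEN_ID = -100  # HuggingFace ignore index for the cross-entropy loss

def preprocess_example(user_prompt, response, tokenizer, max_len=2048):
    # Tokenize the prompt alone to estimate where the response begins.
    # (Token boundaries at the join can shift slightly; the real code
    # aligns on the conversation template instead.)
    prompt_ids = tokenizer(user_prompt, add_special_tokens=True)["input_ids"]
    full_ids = tokenizer(
        user_prompt + response,
        add_special_tokens=True,
        truncation=True,
        max_length=max_len,
    )["input_ids"]
    # Copy input_ids into labels, then mask the prompt prefix so that
    # only response tokens contribute to the finetuning loss.
    labels = list(full_ids)
    for i in range(min(len(prompt_ids), len(labels))):
        labels[i] = IGNORE_TOKEN_ID
    return {
        "input_ids": full_ids,
        "attention_mask": [1] * len(full_ids),
        "labels": labels,
    }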

Usage

Called internally during white-box defense finetuning, with the model's tokenizer and a populated DataArguments instance.

Code Reference

Field | Value
Source | BIPIA repo
File | defense/white_box/finetune.py
Lines | L261-474
Signature | def load_bipia_supervised_data_module(tokenizer: transformers.PreTrainedTokenizer, data_args) -> Dict
Import | Internal function in finetune.py

I/O Contract

Inputs

Parameter | Type | Required | Description
tokenizer | transformers.PreTrainedTokenizer | Yes | Tokenizer for the target model
data_args | DataArguments | Yes | Configuration dataclass containing all data parameters

DataArguments fields:

Field | Description
dataset_name | Task type(s): "all" or specific tasks joined by "+" (e.g., "qa+email+code")
response_strategy | One of "original", "self_clean", "gpt4_clean"
context_data_file | Path to context data
attack_data_file | Path to attack data
response_data_file | Path to clean response data (for the self_clean and gpt4_clean strategies)
bipia_seed | Random seed for dataset construction
add_ign_guidance | Whether to prepend ignore-attack guidance text to prompts
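
The first two fields drive the control flow described above. The following sketch shows one plausible way the task list and response source are resolved; the task names and branching are assumptions inferred from the field descriptions, not code copied from the repository:

ALL_TASKS = ["qa", "email", "table", "abstract", "code"]  # assumed BIPIA task set

def resolve_tasks_and_responses(data_args):
    # "all" expands to every task type; otherwise split the "+"-joined names.
    if data_args.dataset_name == "all":
        tasks = ALL_TASKS
    else:
        tasks = data_args.dataset_name.split("+")

    # response_strategy decides where the target responses come from.
    if data_args.response_strategy == "original":
        response_source = None  # use responses shipped with the PIA dataset
    elif data_args.response_strategy in ("self_clean", "gpt4_clean"):
        response_source = data_args.response_data_file  # pre-generated clean responses
    else:
        raise ValueError(f"unknown response_strategy: {data_args.response_strategy}")
    return tasks, response_source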

Outputs

A Dict whose "train_dataset" entry is a datasets.Dataset with the following columns:

Column | Description
input_ids | Tokenized input sequence (prompt and response concatenated)
attention_mask | Standard attention mask (1 for real tokens, 0 for padding)
labels | Token IDs for loss computation, with IGNORE_TOKEN_ID (-100) masking all tokens before the response start
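
A quick way to sanity-check an example is to count the IGNORE_TOKEN_ID prefix and decode the remainder; because masking covers exactly the tokens before the response start, the decoded suffix should be the response target. A hypothetical inspection snippet, reusing the data_module and tokenizer from the examples below:

IGNORE_TOKEN_ID = -100

example = data_module["train_dataset"][0]

# Masked positions form a contiguous prefix covering the prompt.
n_masked = sum(1 for t in example["labels"] if t == IGNORE_TOKEN_ID)

print("masked prompt tokens:", n_masked)
print("response target:", tokenizer.decode(example["input_ids"][n_masked:]))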

Usage Examples

Function call during finetuning:

from transformers import Trainer
from defense.white_box.finetune import load_bipia_supervised_data_module

data_module = load_bipia_supervised_data_module(
    tokenizer=tokenizer,
    data_args=data_args
)

# data_module is a dict containing the Dataset
# Used directly with HuggingFace Trainer
trainer = Trainer(
    model=model,
    train_dataset=data_module["train_dataset"],
    ...
)

DataArguments configuration example:

from dataclasses import dataclass

@dataclass
class DataArguments:
    dataset_name: str = "all"                       # all 5 task types
    response_strategy: str = "self_clean"           # use the model's own clean responses
    context_data_file: str = "data/context/"
    attack_data_file: str = "data/attacks/"
    response_data_file: str = "output/clean_responses/"
    bipia_seed: int = 42
    add_ign_guidance: bool = False
