Principle:Microsoft BIPIA Training Data Tokenization

From Leeroopedia
Sources: BIPIA paper
Domains: NLP, Tokenization, Defense
Last Updated: 2026-02-14

Overview

A tokenization methodology that converts supervised defense training data into model-ready token sequences with selective label masking, ensuring the model is only trained to predict response tokens while ignoring prompt tokens.

Description

Training data tokenization converts text conversations into token ID sequences with a critical feature: label masking. The input consists of a Vicuna-format conversation:

[BOS] system_prompt USER: <data>context</data> question ASSISTANT: response [EOS]

The labels tensor mirrors input_ids but replaces all tokens before the response with IGNORE_TOKEN_ID (-100), so the loss function only backpropagates through the response generation tokens. The <data>/</data> special tokens are optionally inserted around the context to mark content boundaries. Sequences exceeding model_max_length are filtered out.
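The masking step can be sketched as follows. This is a minimal illustration using a toy whitespace tokenizer (the helper name `build_example` and the vocabulary handling are illustrative, not from the BIPIA codebase); a real pipeline would use the model's own tokenizer and the full Vicuna conversation template.

```python
# Sketch of label masking for a prompt/response pair, assuming a toy
# whitespace tokenizer. IGNORE_TOKEN_ID matches the -100 sentinel that
# standard cross-entropy implementations skip.
IGNORE_TOKEN_ID = -100

def build_example(prompt: str, response: str, vocab: dict, max_len: int = 64):
    """Tokenize prompt + response; mask prompt positions in labels with -100."""
    prompt_ids = [vocab.setdefault(t, len(vocab)) for t in prompt.split()]
    response_ids = [vocab.setdefault(t, len(vocab)) for t in response.split()]
    input_ids = prompt_ids + response_ids
    if len(input_ids) > max_len:
        return None  # overlong sequences are filtered out, not truncated
    # labels mirror input_ids, but every prompt position is ignored by the loss
    labels = [IGNORE_TOKEN_ID] * len(prompt_ids) + response_ids
    return {"input_ids": input_ids, "labels": labels}

vocab = {}
ex = build_example(
    "USER: <data> context </data> question ASSISTANT:",
    "the answer",
    vocab,
)
# Prompt positions carry -100; response positions carry their own token IDs.
```

The key invariant is that `labels[i] == input_ids[i]` for response positions and `labels[i] == -100` everywhere before them, so gradients flow only through response generation.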

Usage

Use when preparing tokenized training batches for white-box defense finetuning. The label masking ensures the model learns to generate correct responses without memorizing prompt patterns.

Theoretical Basis

Causal language model training uses cross-entropy loss on next-token prediction. Label masking with IGNORE_TOKEN_ID (-100) excludes prompt tokens from the loss computation:

Loss = -sum(log P(t_i | t_{<i}))   only for i in response_positions

This is standard practice in instruction tuning (Alpaca, Vicuna) to prevent the model from being trained to reproduce instructions.
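The effect of the `-100` sentinel on the loss can be shown with a small hand-rolled computation (the function `masked_nll` is illustrative; frameworks such as PyTorch implement the same behavior via an `ignore_index` argument to cross-entropy):

```python
import math

IGNORE_TOKEN_ID = -100

def masked_nll(log_probs, labels):
    """Mean negative log-likelihood over positions whose label != -100."""
    total, count = 0.0, 0
    for lp, lab in zip(log_probs, labels):
        if lab == IGNORE_TOKEN_ID:
            continue  # prompt token: contributes nothing to the loss
        total += -lp[lab]
        count += 1
    return total / count  # averaged over response positions only

# Two positions over a 2-token vocabulary: the first is a masked prompt
# token, the second is a response token whose correct label is 1.
log_probs = [
    [math.log(0.5), math.log(0.5)],
    [math.log(0.25), math.log(0.75)],
]
labels = [IGNORE_TOKEN_ID, 1]
loss = masked_nll(log_probs, labels)  # equals -log(0.75)
```

Because the first position is masked, the loss depends only on the model's probability for the response token, exactly as in the formula above.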
