Implementation:Microsoft DeepSpeedExamples GLUE Classifier BERT Base

Knowledge Sources	Microsoft_DeepSpeedExamples
Domains	NLP, Fine-tuning
Last Updated	2026-02-07 12:00 GMT

Overview

BERT-base fine-tuning script for all nine GLUE benchmark tasks with DeepSpeed distributed training integration.

Description

run_glue_classifier_bert_base.py is a fine-tuning script that trains BERT-base models on the GLUE benchmark (General Language Understanding Evaluation) using DeepSpeed for distributed training. The script implements data processors for the nine GLUE tasks: MRPC (paraphrase detection), MNLI (natural language inference, with MNLI-MM as its mismatched evaluation variant), CoLA (linguistic acceptability), SST-2 (sentiment analysis), STS-B (semantic similarity), QQP (question pair equivalence), QNLI (question NLI), RTE (recognizing textual entailment), and WNLI (Winograd NLI).

Each task processor inherits from DataProcessor and implements methods to read TSV data files, create InputExample objects, and define the label set. The convert_examples_to_features() function tokenizes single sentences or sentence pairs using BertTokenizer, pads sequences to max_seq_length, and produces InputFeatures with input IDs, input (attention) masks, segment IDs, and label IDs. Task-specific metrics are computed via compute_metrics(), which dispatches to accuracy, F1, Matthews correlation, or Pearson/Spearman correlation depending on the task.
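The feature conversion described above can be illustrated with a minimal sketch. It is not the script's own code: the whitespace tokenizer, the tiny vocab dictionary, and the names Features, truncate_pair, and to_features are hypothetical stand-ins, whereas the real script uses BertTokenizer and also attaches a label ID. The sketch only shows the [CLS]/[SEP] layout, pair truncation, and zero-padding to max_seq_length.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Features:
    """Hypothetical stand-in for InputFeatures (label_id omitted)."""
    input_ids: List[int]
    input_mask: List[int]
    segment_ids: List[int]


def truncate_pair(tokens_a: List[str], tokens_b: List[str], max_length: int) -> None:
    """Trim the longer sequence one token at a time until the pair fits."""
    while len(tokens_a) + len(tokens_b) > max_length:
        longer = tokens_a if len(tokens_a) > len(tokens_b) else tokens_b
        longer.pop()


def to_features(text_a: str, text_b: Optional[str],
                vocab: Dict[str, int], max_seq_length: int) -> Features:
    # Toy whitespace tokenizer (assumption); the script uses WordPiece.
    tokens_a = text_a.lower().split()
    tokens_b = text_b.lower().split() if text_b else []

    if tokens_b:
        # Reserve room for [CLS], [SEP], [SEP].
        truncate_pair(tokens_a, tokens_b, max_seq_length - 3)
    else:
        # Reserve room for [CLS] and [SEP].
        tokens_a = tokens_a[: max_seq_length - 2]

    tokens = ["[CLS]"] + tokens_a + ["[SEP]"]
    segment_ids = [0] * len(tokens)
    if tokens_b:
        tokens += tokens_b + ["[SEP]"]
        segment_ids += [1] * (len(tokens_b) + 1)

    input_ids = [vocab.get(t, vocab["[UNK]"]) for t in tokens]
    input_mask = [1] * len(input_ids)  # 1 marks real tokens

    # Zero-pad all three sequences up to max_seq_length.
    pad = max_seq_length - len(input_ids)
    input_ids += [vocab["[PAD]"]] * pad
    input_mask += [0] * pad
    segment_ids += [0] * pad
    return Features(input_ids, input_mask, segment_ids)


vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
         "the": 4, "cat": 5, "sat": 6, "a": 7, "slept": 8}
feats = to_features("The cat sat", "A cat slept", vocab, max_seq_length=10)
print(feats.input_ids)    # [2, 4, 5, 6, 3, 7, 5, 8, 3, 0]
print(feats.input_mask)   # [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
print(feats.segment_ids)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
```

Single-sentence tasks such as CoLA and SST-2 pass text_b=None, in which case only segment 0 is used.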

The training loop uses DeepSpeed for initialization and distributed training, with BertAdam optimizer and linear warmup scheduling. The script supports FocalLoss as an alternative to CrossEntropyLoss for handling class imbalance. It integrates with the pytorch_pretrained_bert library for model architecture and tokenization.
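To make the FocalLoss option concrete, here is a minimal per-example sketch in plain Python. It assumes the standard focal loss formulation, FL = -(1 - p_t)^gamma * log(p_t), where p_t is the softmax probability of the true class; the script's own FocalLoss class may differ in details such as class weighting or reduction, and the function names below are illustrative only.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: List[float], label: int) -> float:
    """Standard cross-entropy for one example."""
    return -math.log(softmax(logits)[label])


def focal_loss(logits: List[float], label: int, gamma: float = 2.0) -> float:
    """Focal loss for one example (assumed standard form, no class weights)."""
    p_t = softmax(logits)[label]
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified examples.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


easy = [4.0, 0.0]  # confident and correct for label 0
hard = [0.0, 4.0]  # confident and wrong for label 0

for name, logits in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.6f}")
```

With gamma set to 0 the modulating factor is 1 and the loss reduces to cross-entropy; larger gamma values shift the training signal toward hard, often minority-class, examples.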

Usage

Use this script to fine-tune BERT-base on any GLUE benchmark task with DeepSpeed distributed training. It is the primary entry point for running BERT-base GLUE experiments in the BingBertGlue training example.

Code Reference

Source Location

Repository: Microsoft_DeepSpeedExamples
File: training/BingBertGlue/run_glue_classifier_bert_base.py
Lines: 1-1145

Signature

class InputExample(object):
    def __init__(self, guid, text_a, text_b=None, label=None):
        ...

class InputFeatures(object):
    def __init__(self, input_ids, input_mask, segment_ids, label_id):
        ...

class DataProcessor(object):
    def get_train_examples(self, data_dir): ...
    def get_dev_examples(self, data_dir): ...
    def get_labels(self): ...

class MrpcProcessor(DataProcessor): ...
class MnliProcessor(DataProcessor): ...
class ColaProcessor(DataProcessor): ...
class Sst2Processor(DataProcessor): ...
class StsbProcessor(DataProcessor): ...
class QqpProcessor(DataProcessor): ...
class QnliProcessor(DataProcessor): ...
class RteProcessor(DataProcessor): ...
class WnliProcessor(DataProcessor): ...

def convert_examples_to_features(examples, label_list, max_seq_length, tokenizer, output_mode):
    ...

def compute_metrics(task_name, preds, labels):
    ...

def main():
    ...
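
The processor interface in the signature above can be sketched as follows. ToyPairProcessor, its three-column TSV layout (label, sentence1, sentence2), and the _read_tsv and _create_examples helpers are assumptions made for illustration; the real task processors each use the column layout of their own GLUE files.

```python
import csv
import os
import tempfile


class InputExample:
    def __init__(self, guid, text_a, text_b=None, label=None):
        self.guid = guid
        self.text_a = text_a
        self.text_b = text_b
        self.label = label


class DataProcessor:
    def get_train_examples(self, data_dir):
        raise NotImplementedError

    def get_dev_examples(self, data_dir):
        raise NotImplementedError

    def get_labels(self):
        raise NotImplementedError

    @staticmethod
    def _read_tsv(path):
        # QUOTE_NONE keeps quote characters inside sentences untouched.
        with open(path, "r", encoding="utf-8", newline="") as f:
            return list(csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE))


class ToyPairProcessor(DataProcessor):
    """Hypothetical processor for a header row plus label/sentence1/sentence2."""

    def get_train_examples(self, data_dir):
        rows = self._read_tsv(os.path.join(data_dir, "train.tsv"))
        return self._create_examples(rows, "train")

    def get_dev_examples(self, data_dir):
        rows = self._read_tsv(os.path.join(data_dir, "dev.tsv"))
        return self._create_examples(rows, "dev")

    def get_labels(self):
        return ["0", "1"]

    def _create_examples(self, rows, set_type):
        examples = []
        for i, row in enumerate(rows):
            if i == 0:
                continue  # skip the header row
            label, text_a, text_b = row
            examples.append(InputExample(f"{set_type}-{i}", text_a, text_b, label))
        return examples


# Write a tiny TSV file and read it back through the processor.
with tempfile.TemporaryDirectory() as data_dir:
    with open(os.path.join(data_dir, "train.tsv"), "w", encoding="utf-8") as f:
        f.write("label\tsentence1\tsentence2\n")
        f.write("1\tHe said \"hi\".\tHe greeted us.\n")
        f.write("0\tIt rained.\tThe sun shone.\n")
    examples = ToyPairProcessor().get_train_examples(data_dir)

for ex in examples:
    print(ex.guid, ex.label, ex.text_a, "|", ex.text_b)
```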

Import

# This is a standalone training script, run via DeepSpeed launcher:
# deepspeed run_glue_classifier_bert_base.py --deepspeed_config ds_config.json ...

I/O Contract

Inputs

Name	Type	Required	Description
--data_dir	str	Yes	Directory containing GLUE task TSV data files
--bert_model	str	Yes	Pretrained BERT model name (e.g., bert-base-uncased)
--task_name	str	Yes	GLUE task name: mrpc, mnli, cola, sst-2, sts-b, qqp, qnli, rte, wnli
--output_dir	str	Yes	Directory for model predictions and checkpoints
--max_seq_length	int	No	Maximum tokenized sequence length (default: 128)
--do_train	flag	No	Run training phase
--do_eval	flag	No	Run evaluation phase
--do_lower_case	flag	No	Lowercase the input text; set when using an uncased model
--train_batch_size	int	No	Training batch size (default: 32)
--learning_rate	float	No	Initial learning rate for Adam (default: 5e-5)
--num_train_epochs	float	No	Number of training epochs (default: 3.0)
--local_rank	int	No	Local rank for distributed training (default: -1)

Outputs

Name	Type	Description
eval_results.txt	file	Evaluation metrics (accuracy, F1, MCC, or correlation depending on task)
model checkpoint	directory	Saved model weights and config in output_dir
training logs	stdout	Training loss and evaluation results
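
The task-dependent metrics listed above can be sketched as a small dispatch function. This is an illustration in plain Python rather than the script's implementation: the helper names and the returned dictionary keys are assumptions, the task-to-metric mapping follows the usual GLUE convention, and Spearman correlation for STS-B is left out to keep the sketch short.

```python
import math


def _counts(preds, labels):
    """Confusion-matrix counts for binary labels, with 1 as the positive class."""
    tp = sum(p == 1 and l == 1 for p, l in zip(preds, labels))
    tn = sum(p == 0 and l == 0 for p, l in zip(preds, labels))
    fp = sum(p == 1 and l == 0 for p, l in zip(preds, labels))
    fn = sum(p == 0 and l == 1 for p, l in zip(preds, labels))
    return tp, tn, fp, fn


def accuracy(preds, labels):
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)


def f1_binary(preds, labels):
    tp, _, fp, fn = _counts(preds, labels)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews(preds, labels):
    tp, tn, fp, fn = _counts(preds, labels)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def compute_metrics_sketch(task_name, preds, labels):
    """Pick metrics by task name (illustrative mapping)."""
    if task_name == "cola":
        return {"mcc": matthews(preds, labels)}
    if task_name in ("mrpc", "qqp"):
        return {"acc": accuracy(preds, labels), "f1": f1_binary(preds, labels)}
    if task_name == "sts-b":
        return {"pearson": pearson(preds, labels)}
    if task_name in ("mnli", "mnli-mm", "sst-2", "qnli", "rte", "wnli"):
        return {"acc": accuracy(preds, labels)}
    raise KeyError(task_name)


preds, labels = [1, 0, 1, 1], [1, 0, 0, 1]
print(compute_metrics_sketch("mrpc", preds, labels))
print(compute_metrics_sketch("cola", preds, labels))
```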

Usage Examples

Fine-tune BERT-base on MRPC

# Launch with DeepSpeed for MRPC task
deepspeed run_glue_classifier_bert_base.py \
    --deepspeed_config ds_config.json \
    --data_dir /data/glue/MRPC \
    --bert_model bert-base-uncased \
    --task_name mrpc \
    --output_dir /output/mrpc \
    --do_train \
    --do_eval \
    --do_lower_case \
    --max_seq_length 128 \
    --train_batch_size 32 \
    --learning_rate 2e-5 \
    --num_train_epochs 3.0
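
The command above refers to a ds_config.json file that this page does not show. The sketch below generates one plausible minimal configuration. The specific values, the 4-GPU world size, and the decision to disable fp16 are assumptions for illustration and are not taken from the BingBertGlue example; the keys themselves are standard DeepSpeed configuration fields, and DeepSpeed expects train_batch_size to equal the per-GPU micro batch size times the gradient accumulation steps times the number of GPUs.

```python
import json

# Assumed hardware and batching setup (hypothetical values).
world_size = 4           # number of GPUs
micro_batch_per_gpu = 8  # samples per GPU per forward/backward pass
grad_accum_steps = 1     # micro-batches accumulated per optimizer step

ds_config = {
    # Effective global batch size; 8 * 1 * 4 = 32 here.
    "train_batch_size": micro_batch_per_gpu * grad_accum_steps * world_size,
    "train_micro_batch_size_per_gpu": micro_batch_per_gpu,
    "gradient_accumulation_steps": grad_accum_steps,
    "steps_per_print": 100,
    "fp16": {"enabled": False},
}

config_text = json.dumps(ds_config, indent=2)
print(config_text)

# To use it with the launcher, write the text to ds_config.json:
# with open("ds_config.json", "w", encoding="utf-8") as f:
#     f.write(config_text)
```

Whether the script takes its effective batch size from --train_batch_size or from the DeepSpeed config is not stated here, so keeping the two values consistent is the safer choice.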
