Implementation:Microsoft LoRA Legacy Run Squad

Overview

Fine-tuning script for extractive question answering on SQuAD 1.1 and SQuAD 2.0 supporting BERT, DistilBERT, XLM, XLNet, and other QA-capable transformer architectures.

Description

run_squad.py is a legacy HuggingFace Transformers example script included in the Microsoft LoRA NLU example directory. It provides a complete training and evaluation pipeline for extractive question answering on the Stanford Question Answering Dataset (SQuAD). The script uses AutoModelForQuestionAnswering with AutoConfig and AutoTokenizer for model-agnostic initialization, and relies on squad_convert_examples_to_features for data preprocessing with distributed-safe caching.

The training loop uses AdamW with a linear warmup schedule and supports gradient accumulation, optional mixed-precision training through NVIDIA Apex, multi-GPU training via DataParallel, and distributed training via DistributedDataParallel. Evaluation follows one of two post-processing paths. Models that output start/end logits (for example BERT and DistilBERT) are decoded with compute_predictions_logits, while XLNet and XLM, which output top-k start/end indices together with cls_logits, are decoded with compute_predictions_log_probs. SQuAD 2.0 support, including null-answer detection, is enabled with --version_2_with_negative.
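The interaction between gradient accumulation and the linear warmup schedule can be illustrated with a small, dependency-free sketch. It assumes the usual convention that one optimizer update happens per accumulated group of batches and that the learning-rate multiplier rises linearly to 1 during warmup and then decays linearly to 0. The function names and the batch counts are hypothetical and are not taken from the script.

```python
def total_optimizer_steps(batches_per_epoch, gradient_accumulation_steps, num_train_epochs):
    """Number of optimizer updates: one update per accumulated group of batches."""
    return int(batches_per_epoch // gradient_accumulation_steps * num_train_epochs)


def linear_warmup_factor(step, warmup_steps, total_steps):
    """Multiplier applied to the base learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Ramp linearly from 0 to 1 during warmup.
        return step / max(1, warmup_steps)
    # Afterwards decay linearly from 1 down to 0 at the final step.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Hypothetical run: 1000 batches per epoch, 4 accumulated batches per update, 2 epochs.
total = total_optimizer_steps(1000, 4, 2.0)
base_lr = 3e-5
for step in (0, 25, 50, 275, total):
    factor = linear_warmup_factor(step, warmup_steps=50, total_steps=total)
    print(step, base_lr * factor)
```

Raising the accumulation count shrinks the number of optimizer updates, so a fixed warmup length covers a larger share of training.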

This script is part of the HuggingFace Transformers library (legacy examples) bundled in the Microsoft LoRA repository.

⚠️ DEPRECATED: This file resides in the legacy/ directory and is not actively maintained. For new work, prefer a maintained equivalent, such as the run_qa.py question-answering example in HuggingFace Transformers.

Usage

Use this script to fine-tune a pretrained transformer model for extractive question answering on SQuAD 1.1 or SQuAD 2.0 datasets, or to evaluate previously fine-tuned checkpoints. It supports resuming training from checkpoints, evaluation of all saved checkpoints, and TensorBoard logging.
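Resuming from a checkpoint comes down to recovering the global step from the checkpoint directory name and converting it into completed epochs plus a number of updates to skip. The sketch below shows that arithmetic under the assumption that checkpoint directories are named checkpoint-{step}. The helper names are hypothetical and the script's own bookkeeping may differ in detail.

```python
def parse_global_step(model_path):
    """Return the step encoded in a path such as '/out/checkpoint-1500/', or None."""
    suffix = model_path.split("-")[-1].split("/")[0]
    try:
        return int(suffix)
    except ValueError:
        # Not a checkpoint directory (e.g. a hub model name): start from step 0.
        return None


def resume_position(global_step, batches_per_epoch, gradient_accumulation_steps):
    """Split a global step into (epochs already done, updates to skip in the current epoch)."""
    updates_per_epoch = batches_per_epoch // gradient_accumulation_steps
    return global_step // updates_per_epoch, global_step % updates_per_epoch


step = parse_global_step("/output/squad_bert/checkpoint-1500/")
print(step, resume_position(step, batches_per_epoch=1000, gradient_accumulation_steps=1))
print(parse_global_step("bert-base-uncased"))
```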

Code Reference

Source Location

Property	Value
File path	`examples/NLU/examples/legacy/question-answering/run_squad.py`
Lines	830
Module	`run_squad`

Key Functions

Name	Signature	Description
`train`	`train(args, train_dataset, model, tokenizer)`	Full training loop with DDP, gradient accumulation, TensorBoard, checkpointing; returns `(global_step, avg_loss)`
`evaluate`	`evaluate(args, model, tokenizer, prefix="")`	Runs evaluation, computes predictions, returns F1/exact match dict
`load_and_cache_examples`	`load_and_cache_examples(args, tokenizer, evaluate=False, output_examples=False)`	Loads SQuAD data with caching; uses `SquadV1Processor` or `SquadV2Processor`
`set_seed`	`set_seed(args)`	Seeds the `random`, NumPy, and PyTorch RNGs (including all CUDA devices when `args.n_gpu > 0`)
`main`	`main()`	Entry point: parses args, initializes model, runs train/eval pipeline
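As a rough illustration of what "computes predictions" involves for models that emit start/end logits, the sketch below picks the highest-scoring valid span from two logit lists. It is a simplified stand-in, not the library's post-processing: the function name is hypothetical, the default limits are illustrative, and filtering of question tokens and overlapping windows is omitted.

```python
def best_answer_span(start_logits, end_logits, n_best_size=20, max_answer_length=30):
    """Return (start, end, score) for the best valid span, or None if there is none."""

    def top_indexes(logits):
        # Indexes of the n_best_size largest logits, best first.
        order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        return order[:n_best_size]

    best = None
    for start in top_indexes(start_logits):
        for end in top_indexes(end_logits):
            if end < start:
                continue  # the span must not end before it starts
            if end - start + 1 > max_answer_length:
                continue  # the span is too long
            score = start_logits[start] + end_logits[end]
            if best is None or score > best[2]:
                best = (start, end, score)
    return best


# The individually best start (index 4) and end (index 2) do not form a valid span,
# so the search has to consider combinations rather than two separate argmaxes.
start_logits = [0.1, 2.0, 0.3, 0.2, 2.5]
end_logits = [0.0, 0.5, 3.0, 0.1, 0.2]
print(best_answer_span(start_logits, end_logits))
```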

CLI Usage

python run_squad.py \
  --model_type bert \
  --model_name_or_path bert-base-uncased \
  --do_train \
  --do_eval \
  --data_dir /path/to/squad \
  --train_file train-v1.1.json \
  --predict_file dev-v1.1.json \
  --output_dir /path/to/output \
  --per_gpu_train_batch_size 8 \
  --learning_rate 3e-5 \
  --num_train_epochs 2.0 \
  --max_seq_length 384 \
  --doc_stride 128

I/O Contract

Inputs

Input	Type	Description
`--model_type`	`str` (required)	Model architecture type (e.g., `bert`, `xlnet`, `distilbert`, `xlm`)
`--model_name_or_path`	`str` (required)	Pretrained model name or path
`--output_dir`	`str` (required)	Directory for checkpoints and predictions
`--data_dir`	`str`	Directory containing SQuAD JSON files
`--train_file`	`str`	Training file name (e.g., `train-v1.1.json`)
`--predict_file`	`str`	Evaluation file name (e.g., `dev-v1.1.json`)
`--version_2_with_negative`	flag	Enable SQuAD 2.0 mode with unanswerable questions
`--max_seq_length`	`int` (default 384)	Maximum input sequence length after tokenization
`--doc_stride`	`int` (default 128)	Stride for splitting long documents into chunks
`--max_query_length`	`int` (default 64)	Maximum number of tokens for the question
`--per_gpu_train_batch_size`	`int` (default 8)	Batch size per GPU for training
`--learning_rate`	`float` (default 5e-5)	Initial learning rate for AdamW
`--num_train_epochs`	`float` (default 3.0)	Total training epochs
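To show how --max_seq_length, --max_query_length, and --doc_stride interact, here is a minimal sliding-window sketch in the style of the original BERT SQuAD preprocessing. It assumes three special tokens per sequence (a BERT-style [CLS] question [SEP] context [SEP] layout). The function name is hypothetical, and the exact window bookkeeping inside squad_convert_examples_to_features may differ.

```python
def doc_spans(num_doc_tokens, num_query_tokens, max_seq_length=384,
              doc_stride=128, max_query_length=64, special_tokens=3):
    """Return (start, length) windows that cover a tokenized context."""
    # The question is truncated first; the context gets whatever room is left.
    query_len = min(num_query_tokens, max_query_length)
    max_tokens_for_doc = max_seq_length - query_len - special_tokens
    if max_tokens_for_doc <= 0:
        raise ValueError("max_seq_length leaves no room for context tokens")

    spans = []
    start = 0
    while start < num_doc_tokens:
        length = min(num_doc_tokens - start, max_tokens_for_doc)
        spans.append((start, length))
        if start + length == num_doc_tokens:
            break  # the last window reaches the end of the context
        start += min(length, doc_stride)
    return spans


# A 500-token context with a 20-token question needs three overlapping windows.
print(doc_spans(num_doc_tokens=500, num_query_tokens=20))
```

A smaller stride produces more, heavily overlapping windows, which increases preprocessing and training cost but makes it less likely that an answer is cut at a window boundary.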

Outputs

Output	Type	Description
`predictions_{prefix}.json`	JSON	Best predicted answer span for each question
`nbest_predictions_{prefix}.json`	JSON	Top-N predicted answer spans with probabilities
`null_odds_{prefix}.json`	JSON	Null answer odds (SQuAD 2.0 only)
`checkpoint-{step}/`	directory	Saved model, tokenizer, optimizer, scheduler states
`training_args.bin`	binary	Serialized training arguments
Return value of `evaluate`	`Dict[str, float]`	Evaluation metrics, including F1 and exact-match scores
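For readers unfamiliar with the two reported metrics, the following sketch computes exact match and token-level F1 for a single prediction, using the answer normalization commonly applied in SQuAD evaluation (lowercasing, stripping punctuation and English articles, collapsing whitespace). It is an illustration only: the script delegates scoring to library helpers, and the function names here are hypothetical.

```python
import collections
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = "".join(ch for ch in text.lower() if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(gold, pred):
    """1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(gold) == normalize_answer(pred))


def f1_score(gold, pred):
    """Harmonic mean of token precision and recall after normalization."""
    gold_tokens = normalize_answer(gold).split()
    pred_tokens = normalize_answer(pred).split()
    if not gold_tokens or not pred_tokens:
        # Unanswerable case: credit only if both sides are empty.
        return float(gold_tokens == pred_tokens)
    common = collections.Counter(gold_tokens) & collections.Counter(pred_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("the Eiffel Tower", "Eiffel Tower."))
print(f1_score("Denver Broncos", "the Broncos team"))
```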

Usage Examples

Fine-tuning BERT on SQuAD 1.1

python run_squad.py \
  --model_type bert \
  --model_name_or_path bert-base-uncased \
  --do_train \
  --do_eval \
  --data_dir /data/squad/ \
  --train_file train-v1.1.json \
  --predict_file dev-v1.1.json \
  --output_dir /output/squad_bert/ \
  --per_gpu_train_batch_size 12 \
  --learning_rate 3e-5 \
  --num_train_epochs 2.0 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --overwrite_output_dir

Fine-tuning XLNet on SQuAD 2.0

python run_squad.py \
  --model_type xlnet \
  --model_name_or_path xlnet-large-cased \
  --do_train \
  --do_eval \
  --version_2_with_negative \
  --data_dir /data/squad/ \
  --train_file train-v2.0.json \
  --predict_file dev-v2.0.json \
  --output_dir /output/squad2_xlnet/ \
  --per_gpu_train_batch_size 4 \
  --learning_rate 3e-5 \
  --num_train_epochs 4.0 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --overwrite_output_dir
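In SQuAD 2.0 mode, post-processing has to choose between the best answer span and "no answer". The sketch below shows the simple logit-based form of that decision: the null score minus the best span score is compared against a threshold, and an empty string is predicted when the difference is larger. This is an assumption-laden illustration with a hypothetical function name. The XLNet/XLM path derives its null score from cls_logits and is not reproduced here.

```python
def choose_answer(best_span_text, best_span_score, null_score, threshold=0.0):
    """Return (predicted_text, score_diff); an empty string means 'unanswerable'."""
    score_diff = null_score - best_span_score
    if score_diff > threshold:
        return "", score_diff
    return best_span_text, score_diff


# The span clearly beats the null score, so it is kept.
print(choose_answer("Denver Broncos", best_span_score=7.5, null_score=3.0))
# The null score wins, so the question is treated as unanswerable.
print(choose_answer("in 1999", best_span_score=2.0, null_score=6.0))
# A higher threshold makes the model more reluctant to abstain.
print(choose_answer("in 1999", best_span_score=2.0, null_score=6.0, threshold=5.0))
```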

Evaluating All Checkpoints

python run_squad.py \
  --model_type bert \
  --model_name_or_path /output/squad_bert/ \
  --do_eval \
  --eval_all_checkpoints \
  --data_dir /data/squad/ \
  --predict_file dev-v1.1.json \
  --output_dir /output/squad_bert/ \
  --max_seq_length 384
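Evaluating every checkpoint requires finding each directory under the output directory that contains saved weights. A minimal sketch of that discovery step follows. It assumes the weights file is named pytorch_model.bin and that checkpoint directories follow the checkpoint-{step} pattern. Sorting numerically by step is a choice made in this sketch, and the helper name is hypothetical.

```python
import glob
import os
import tempfile

WEIGHTS_NAME = "pytorch_model.bin"  # assumed name of the saved weights file


def find_checkpoints(output_dir):
    """Return directories containing saved weights, ordered by training step."""
    pattern = os.path.join(output_dir, "**", WEIGHTS_NAME)
    dirs = {os.path.dirname(path) for path in glob.glob(pattern, recursive=True)}

    def step_of(directory):
        suffix = os.path.basename(directory).split("-")[-1]
        # Directories without a numeric suffix (the final model) sort last.
        return int(suffix) if suffix.isdigit() else float("inf")

    return sorted(dirs, key=lambda d: (step_of(d), d))


# Build a throwaway output directory with two checkpoints and a final model.
with tempfile.TemporaryDirectory() as out:
    for name in ("checkpoint-500", "checkpoint-1000", "."):
        directory = os.path.join(out, name)
        os.makedirs(directory, exist_ok=True)
        open(os.path.join(directory, WEIGHTS_NAME), "wb").close()
    found = [os.path.relpath(d, out) for d in find_checkpoints(out)]
    print(found)
```

A plain lexicographic sort would place checkpoint-1000 before checkpoint-500, which is why the sketch sorts on the parsed step number.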
