Implementation:Microsoft LoRA Run GLUE No Trainer

From Leeroopedia



Overview

run_glue_no_trainer.py is a GLUE benchmark fine-tuning script that uses HuggingFace Accelerate instead of the Trainer API, implementing a manual training loop with explicit optimizer, scheduler, and gradient accumulation control.

Description

This script demonstrates how to fine-tune AutoModelForSequenceClassification on GLUE tasks without relying on the Trainer abstraction. It uses the Accelerator class from HuggingFace Accelerate for device placement, distributed training, and mixed precision support.

Key implementation details:

  • Task mapping: Defines task_to_keys mapping GLUE task names to their sentence column names:
    • Single-sentence: cola ("sentence"), sst2 ("sentence")
    • Sentence-pair: mnli ("premise", "hypothesis"), mrpc ("sentence1", "sentence2"), qnli ("question", "sentence"), qqp ("question1", "question2"), rte ("sentence1", "sentence2"), stsb ("sentence1", "sentence2"), wnli ("sentence1", "sentence2")
  • Manual training loop: Implements explicit forward pass, loss computation, gradient accumulation (loss / gradient_accumulation_steps), backward pass via accelerator.backward(), optimizer step, and learning rate scheduler step.
  • Optimizer setup: Uses AdamW with two parameter groups: weight decay is applied to all parameters except bias and LayerNorm weights.
  • Learning rate scheduling: Supports multiple scheduler types via get_scheduler(): linear, cosine, cosine_with_restarts, polynomial, constant, constant_with_warmup.
  • Argument parsing: Uses standard argparse instead of HfArgumentParser, with explicit parameter definitions for batch size, learning rate, epochs, etc.
  • Evaluation: Per-epoch evaluation using GLUE task-specific metrics. Predictions gathered across processes via accelerator.gather().
  • MNLI special handling: After training, performs additional evaluation on the mismatched validation set.
  • Model saving: Uses accelerator.unwrap_model() and accelerator.save() for distributed-safe model saving.
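The task-to-column mapping listed above can be sketched as a plain dictionary keyed by task name, with `None` marking single-sentence tasks:

```python
# GLUE task name -> (first sentence column, second sentence column or None),
# following the single-sentence / sentence-pair split described above.
task_to_keys = {
    "cola": ("sentence", None),
    "mnli": ("premise", "hypothesis"),
    "mrpc": ("sentence1", "sentence2"),
    "qnli": ("question", "sentence"),
    "qqp": ("question1", "question2"),
    "rte": ("sentence1", "sentence2"),
    "sst2": ("sentence", None),
    "stsb": ("sentence1", "sentence2"),
    "wnli": ("sentence1", "sentence2"),
}

# Single-sentence tasks have no second column.
single_sentence_tasks = [t for t, (_, key2) in task_to_keys.items() if key2 is None]
```

The tokenizer call can then pass one or two text columns uniformly by unpacking the tuple and dropping the `None`.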

The STSB task is handled as regression (num_labels=1); all others are classification.
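The gradient-accumulation schedule of the manual loop can be illustrated with a minimal pure-Python sketch. The step condition below is the common form used in Accelerate example scripts of this era (step divisible by the accumulation count, or last batch of the epoch); it is a sketch of the pattern, not verified line-by-line against this file, and the real loop additionally calls accelerator.backward(), optimizer.step(), lr_scheduler.step(), and optimizer.zero_grad():

```python
def optimizer_step_indices(num_batches, gradient_accumulation_steps):
    """Return the batch indices at which the optimizer would step.

    Sketch of the common Accelerate-example condition:
        step % gradient_accumulation_steps == 0 or step == num_batches - 1
    The per-batch loss is divided by gradient_accumulation_steps
    before the backward pass, so accumulated gradients average out.
    """
    steps = []
    for step in range(num_batches):
        if step % gradient_accumulation_steps == 0 or step == num_batches - 1:
            steps.append(step)
    return steps
```

For example, with 6 batches and an accumulation count of 3, the optimizer steps at batch indices 0, 3, and 5 (the final partial accumulation still triggers a step).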

Usage

Use this script when you need to:

  • Fine-tune on GLUE benchmarks with full control over the training loop
  • Customize gradient accumulation, optimizer groups, or learning rate scheduling
  • Use HuggingFace Accelerate for distributed training without the Trainer abstraction
  • Prototype or debug training behavior with explicit step-by-step control
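The optimizer-group split described under Key implementation details (weight decay for everything except bias and LayerNorm weights) can be sketched with plain name filtering; the parameter names below are illustrative, not taken from the script:

```python
# Substrings that mark parameters exempt from weight decay.
no_decay = ["bias", "LayerNorm.weight"]

def split_param_groups(named_params, weight_decay=0.01):
    """Build AdamW-style parameter groups from (name, param) pairs:
    weight decay for all parameters except bias and LayerNorm weights."""
    decay = [n for n, _ in named_params if not any(nd in n for nd in no_decay)]
    nodecay = [n for n, _ in named_params if any(nd in n for nd in no_decay)]
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": nodecay, "weight_decay": 0.0},
    ]

# Illustrative (hypothetical) parameter names:
names = [
    ("encoder.layer.0.attention.query.weight", None),
    ("encoder.layer.0.attention.query.bias", None),
    ("encoder.layer.0.LayerNorm.weight", None),
]
groups = split_param_groups(names)
```

In the real script the two groups are passed directly to `AdamW(optimizer_grouped_parameters, lr=args.learning_rate)`.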

Code Reference

Source Location

Property     Value
File         examples/NLU/examples/text-classification/run_glue_no_trainer.py
Lines        441
Module       run_glue_no_trainer
Entry Point  main()

Signature/CLI

python run_glue_no_trainer.py \
    --model_name_or_path MODEL_NAME \
    --task_name TASK_NAME \
    [--train_file TRAIN_FILE] \
    [--validation_file VALIDATION_FILE] \
    [--max_length 128] \
    [--pad_to_max_length] \
    [--per_device_train_batch_size 8] \
    [--per_device_eval_batch_size 8] \
    [--learning_rate 5e-5] \
    [--weight_decay 0.0] \
    [--num_train_epochs 3] \
    [--max_train_steps NUM_STEPS] \
    [--gradient_accumulation_steps 1] \
    [--lr_scheduler_type linear] \
    [--num_warmup_steps 0] \
    [--output_dir OUTPUT_DIR] \
    [--seed SEED]
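A trimmed-down version of the argparse setup implied by the CLI above (only a subset of the flags, using the defaults listed in the I/O contract; exact help strings and the full flag set are in the script itself):

```python
import argparse

def build_parser():
    """Sketch of the script's standard-argparse CLI (subset of flags)."""
    parser = argparse.ArgumentParser(
        description="Fine-tune a model on a GLUE task without the Trainer API"
    )
    parser.add_argument("--model_name_or_path", type=str, required=True)
    parser.add_argument("--task_name", type=str, default=None,
                        choices=["cola", "mnli", "mrpc", "qnli", "qqp",
                                 "rte", "sst2", "stsb", "wnli"])
    parser.add_argument("--max_length", type=int, default=128)
    parser.add_argument("--per_device_train_batch_size", type=int, default=8)
    parser.add_argument("--per_device_eval_batch_size", type=int, default=8)
    parser.add_argument("--learning_rate", type=float, default=5e-5)
    parser.add_argument("--weight_decay", type=float, default=0.0)
    parser.add_argument("--num_train_epochs", type=int, default=3)
    parser.add_argument("--gradient_accumulation_steps", type=int, default=1)
    parser.add_argument("--lr_scheduler_type", type=str, default="linear")
    parser.add_argument("--num_warmup_steps", type=int, default=0)
    parser.add_argument("--output_dir", type=str, default=None)
    return parser

args = build_parser().parse_args(
    ["--model_name_or_path", "bert-base-uncased", "--task_name", "sst2"]
)
```

Unlike HfArgumentParser-based examples, every flag is declared explicitly, which makes it straightforward to add or remove options when adapting the script.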

Import

from accelerate import Accelerator
from transformers import (
    AdamW,
    AutoConfig,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    PretrainedConfig,
    SchedulerType,
    default_data_collator,
    get_scheduler,
    set_seed,
)
from datasets import load_dataset, load_metric
from torch.utils.data.dataloader import DataLoader

I/O Contract

Inputs

Parameter                      Type           Required  Default  Description
--model_name_or_path           str            Yes       -        Pretrained model name or path
--task_name                    str            No        None     GLUE task: cola, mnli, mrpc, qnli, qqp, rte, sst2, stsb, wnli
--train_file                   str            No        None     Custom CSV/JSON training file
--validation_file              str            No        None     Custom CSV/JSON validation file
--max_length                   int            No        128      Max tokenized sequence length
--per_device_train_batch_size  int            No        8        Training batch size per device
--per_device_eval_batch_size   int            No        8        Evaluation batch size per device
--learning_rate                float          No        5e-5     Peak learning rate
--weight_decay                 float          No        0.0      Weight decay for non-bias/LayerNorm parameters
--num_train_epochs             int            No        3        Number of training epochs
--max_train_steps              int            No        None     Max training steps (overrides epochs)
--gradient_accumulation_steps  int            No        1        Steps to accumulate before optimizer step
--lr_scheduler_type            SchedulerType  No        linear   Learning rate scheduler type
--num_warmup_steps             int            No        0        Warmup steps for learning rate scheduler
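The interplay between --num_train_epochs, --max_train_steps, and --gradient_accumulation_steps follows the usual Accelerate-example arithmetic; the sketch below shows the pattern (variable names are from that common pattern, not verified against this file):

```python
import math

def resolve_train_steps(num_batches_per_epoch, num_train_epochs,
                        gradient_accumulation_steps, max_train_steps=None):
    """If --max_train_steps is unset, derive it from the epoch count;
    otherwise recompute the epoch count from it (ceiling division),
    since one optimizer update covers gradient_accumulation_steps batches."""
    num_update_steps_per_epoch = math.ceil(
        num_batches_per_epoch / gradient_accumulation_steps
    )
    if max_train_steps is None:
        max_train_steps = num_train_epochs * num_update_steps_per_epoch
    else:
        num_train_epochs = math.ceil(max_train_steps / num_update_steps_per_epoch)
    return max_train_steps, num_train_epochs
```

For example, 100 batches per epoch with accumulation 2 gives 50 updates per epoch, so 3 epochs resolve to 150 total optimizer steps; conversely, capping at 120 steps with accumulation 1 shortens training to 2 epochs.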

Outputs

Output           Location       Description
Trained model    {output_dir}/  Saved model weights via save_pretrained()
Epoch metrics    stdout/logs    Per-epoch evaluation metrics logged to console
MNLI-mm metrics  stdout/logs    Mismatched validation metrics (MNLI task only)

Usage Examples

Fine-tune on SST-2

python examples/NLU/examples/text-classification/run_glue_no_trainer.py \
    --model_name_or_path bert-base-uncased \
    --task_name sst2 \
    --per_device_train_batch_size 32 \
    --learning_rate 2e-5 \
    --num_train_epochs 3 \
    --output_dir /tmp/sst2_no_trainer

Fine-tune on MNLI with gradient accumulation

python examples/NLU/examples/text-classification/run_glue_no_trainer.py \
    --model_name_or_path roberta-base \
    --task_name mnli \
    --per_device_train_batch_size 16 \
    --gradient_accumulation_steps 2 \
    --learning_rate 1e-5 \
    --lr_scheduler_type cosine \
    --num_warmup_steps 500 \
    --num_train_epochs 3 \
    --output_dir /tmp/mnli_no_trainer
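For the MNLI command above, the effective global batch size per optimizer update is the per-device batch size times the accumulation steps times the number of processes:

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_processes=1):
    """Total number of examples contributing to each optimizer update."""
    return per_device_batch * grad_accum_steps * num_processes

# The MNLI example: 16 per device, accumulation of 2.
single_gpu = effective_batch_size(16, 2)                   # 32 on one GPU
four_gpu = effective_batch_size(16, 2, num_processes=4)    # 128 across 4 GPUs
```

This is worth keeping in mind when porting learning rates between single-GPU and multi-GPU runs, since the effective batch size scales with the process count.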

Distributed training with Accelerate

accelerate launch examples/NLU/examples/text-classification/run_glue_no_trainer.py \
    --model_name_or_path bert-base-uncased \
    --task_name mrpc \
    --per_device_train_batch_size 16 \
    --learning_rate 5e-5 \
    --num_train_epochs 5 \
    --output_dir /tmp/mrpc_distributed
