Implementation:Microsoft LoRA Lightning Base

From Leeroopedia


Overview

Base PyTorch Lightning module and training utilities for fine-tuning HuggingFace Transformer models across multiple NLP task types.

Description

lightning_base.py provides BaseTransformer, a pl.LightningModule subclass that serves as the foundation for training HuggingFace Transformer models with PyTorch Lightning. It handles model initialization (via AutoConfig, AutoTokenizer, and a task-specific AutoModel class), optimizer configuration (AdamW or Adafactor with grouped weight decay), and learning rate scheduling (linear, cosine, cosine with hard restarts, or polynomial decay, each with warmup). The module also includes LoggingCallback for logging learning rates and validation/test metrics, add_generic_args for CLI argument registration, and generic_train for standardized trainer initialization with checkpoint saving, early stopping, and distributed training support.
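
Grouped weight decay usually means splitting parameters into a group that receives weight decay and a group that is exempt. The sketch below illustrates only the grouping step, working on parameter names rather than tensors so that it runs without PyTorch. The exclusion list (bias and LayerNorm.weight), the helper name group_parameter_names, and the sample names are assumptions for illustration, not taken from this page.

```python
from typing import Dict, List

# Assumed exclusion list: name fragments whose parameters skip weight decay.
NO_DECAY = ("bias", "LayerNorm.weight")


def group_parameter_names(names: List[str], weight_decay: float) -> List[Dict]:
    """Split parameter names into a decayed group and an exempt group."""
    decayed = [n for n in names if not any(nd in n for nd in NO_DECAY)]
    exempt = [n for n in names if any(nd in n for nd in NO_DECAY)]
    return [
        {"params": decayed, "weight_decay": weight_decay},
        {"params": exempt, "weight_decay": 0.0},
    ]


# Hypothetical parameter names, for illustration only.
example_names = [
    "encoder.layer.0.attention.self.query.weight",
    "encoder.layer.0.attention.self.query.bias",
    "encoder.layer.0.output.LayerNorm.weight",
    "encoder.layer.0.output.LayerNorm.bias",
]
groups = group_parameter_names(example_names, weight_decay=0.01)
```

In a real optimizer setup the "params" lists would hold the parameter tensors themselves, and the resulting list of groups would be passed to the optimizer constructor.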

The module maps task types to AutoModel classes via the MODEL_MODES dictionary, supporting base, sequence-classification, question-answering, pretraining, token-classification, language-modeling, summarization, and translation modes.

This is part of the HuggingFace Transformers legacy examples bundled in the Microsoft LoRA repository.

⚠️ DEPRECATED: This file resides in the legacy/ directory and is not actively maintained. Prefer modern equivalents where available.

Usage

Use this module as a base class when building PyTorch Lightning training scripts for NLP tasks. Subclass BaseTransformer and implement get_dataloader(), training_step(), and validation_step()/validation_end() methods for your specific task. Call generic_train() to launch training with standard arguments.

Code Reference

Source Location

Property | Value
File path | examples/NLU/examples/legacy/pytorch-lightning/lightning_base.py
Lines | 391
Module | lightning_base

Key Classes and Functions

Name | Type | Signature / Description
BaseTransformer | class | __init__(self, hparams: argparse.Namespace, num_labels=None, mode="base", config=None, tokenizer=None, model=None, **config_kwargs)
BaseTransformer.configure_optimizers | method | Returns [optimizer], [scheduler] with AdamW or Adafactor and selected LR schedule
BaseTransformer.total_steps | method | total_steps() -> int -- computes total training steps from dataset size, batch size, accumulation, and epochs
BaseTransformer.on_save_checkpoint | method | Saves model and tokenizer to output_dir/best_tfmr (rank zero only)
BaseTransformer.add_model_specific_args | static method | Registers model-specific CLI arguments (model path, LR, scheduler, batch sizes, etc.)
LoggingCallback | class | pl.Callback that logs LR on batch end and validation/test metrics on epoch end
add_generic_args | function | add_generic_args(parser, root_dir) -> None -- adds output_dir, fp16, seed, data_dir, etc.
generic_train | function | generic_train(model, args, early_stopping_callback=None, logger=True, extra_callbacks=[], checkpoint_callback=None, logging_callback=None, **extra_train_kwargs) -- initializes and runs pl.Trainer
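
To make the total_steps description concrete, the following is a rough sketch of how a step count can be derived from dataset size, batch size, gradient accumulation, and epochs. The function name estimate_total_steps, the num_devices parameter, and the choice to round up per epoch are assumptions made for this illustration; the module's own method may differ in these details.

```python
import math


def estimate_total_steps(
    dataset_size: int,
    train_batch_size: int,
    accumulate_grad_batches: int,
    max_epochs: int,
    num_devices: int = 1,
) -> int:
    """Estimate the number of optimizer steps over a whole training run."""
    # Examples consumed per optimizer step across all devices.
    effective_batch_size = train_batch_size * accumulate_grad_batches * max(1, num_devices)
    # Assumption: a final partial batch still counts as one step in each epoch.
    steps_per_epoch = math.ceil(dataset_size / effective_batch_size)
    return steps_per_epoch * max_epochs


steps = estimate_total_steps(
    dataset_size=10_000,
    train_batch_size=32,
    accumulate_grad_batches=2,
    max_epochs=3,
)
```

With these hypothetical numbers the effective batch size is 64, giving 157 steps per epoch and 471 steps in total. A value like this is what a warmup-based scheduler needs as its total step count.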

MODEL_MODES Dictionary

MODEL_MODES = {
    "base": AutoModel,
    "sequence-classification": AutoModelForSequenceClassification,
    "question-answering": AutoModelForQuestionAnswering,
    "pretraining": AutoModelForPreTraining,
    "token-classification": AutoModelForTokenClassification,
    "language-modeling": AutoModelWithLMHead,
    "summarization": AutoModelForSeq2SeqLM,
    "translation": AutoModelForSeq2SeqLM,
}
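
The mode argument acts as a key into this dictionary. A minimal sketch of that lookup, with a clearer error for unknown modes, is shown below. To keep it runnable without the transformers package it maps modes to class names as strings; the names MODE_TO_CLASS_NAME and resolve_mode are hypothetical and the error handling is an assumption, not behavior documented for the module.

```python
# Stand-in for MODEL_MODES: class names as strings instead of the real classes.
MODE_TO_CLASS_NAME = {
    "base": "AutoModel",
    "sequence-classification": "AutoModelForSequenceClassification",
    "question-answering": "AutoModelForQuestionAnswering",
    "pretraining": "AutoModelForPreTraining",
    "token-classification": "AutoModelForTokenClassification",
    "language-modeling": "AutoModelWithLMHead",
    "summarization": "AutoModelForSeq2SeqLM",
    "translation": "AutoModelForSeq2SeqLM",
}


def resolve_mode(mode: str = "base") -> str:
    """Return the class name registered for a task mode."""
    try:
        return MODE_TO_CLASS_NAME[mode]
    except KeyError:
        # Hypothetical error handling: list the valid modes for the caller.
        valid = ", ".join(sorted(MODE_TO_CLASS_NAME))
        raise ValueError(f"Unknown mode {mode!r}; expected one of: {valid}") from None


classifier_class = resolve_mode("sequence-classification")
```

Note that summarization and translation resolve to the same sequence-to-sequence class, so the distinction between them lives in the subclass and its data, not in the model head.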

LR Scheduler Options

arg_to_scheduler = {
    "linear": get_linear_schedule_with_warmup,
    "cosine": get_cosine_schedule_with_warmup,
    "cosine_w_restarts": get_cosine_with_hard_restarts_schedule_with_warmup,
    "polynomial": get_polynomial_decay_schedule_with_warmup,
}
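
As an illustration of what the "linear" option does, the sketch below computes the factor by which the base learning rate is scaled at each step: it rises linearly from 0 to 1 during warmup, then falls linearly back to 0 by the final step. This is a plain-Python approximation written for this page, and the function name linear_warmup_multiplier is hypothetical.

```python
def linear_warmup_multiplier(step: int, warmup_steps: int, total_steps: int) -> float:
    """Learning-rate scale factor for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to 1.
        return step / max(1, warmup_steps)
    # Decay phase: ramp from 1 down to 0, never going negative.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Hypothetical run: 500 warmup steps out of 1000 total steps.
multipliers = [linear_warmup_multiplier(s, 500, 1000) for s in (0, 250, 500, 750, 1000)]
```

Multiplying each factor by the configured learning_rate gives the learning rate in effect at that step. The other scheduler options keep the same warmup phase and change only the shape of the decay.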

Import Usage

from lightning_base import BaseTransformer, add_generic_args, generic_train, LoggingCallback

I/O Contract

Inputs

Input | Type | Description
hparams | argparse.Namespace | CLI arguments including model_name_or_path, output_dir, learning_rate, lr_scheduler, warmup_steps, train_batch_size, max_epochs, etc.
num_labels | Optional[int] | Number of output labels (passed to AutoConfig)
mode | str | Key into MODEL_MODES dict selecting the AutoModel class (default "base")
config | Optional[PretrainedConfig] | Pre-built config (if None, loaded from model_name_or_path)
tokenizer | Optional[PreTrainedTokenizer] | Pre-built tokenizer (if None, loaded from model_name_or_path)
model | Optional[PreTrainedModel] | Pre-built model (if None, loaded from model_name_or_path)

Outputs

Output | Type | Description
generic_train() return | pl.Trainer | Configured and (optionally) fitted PyTorch Lightning Trainer
Checkpoint directory | directory | Model and tokenizer saved to output_dir/best_tfmr/
Test results file | test_results.txt | Written by LoggingCallback.on_test_end() to output_dir/
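
The test results file can be pictured as a plain-text dump of the final metrics. The sketch below shows one way such a file could be written, one "key = value" line per metric in sorted order. The helper name write_test_results, the exact line format, and the sample metrics are assumptions for illustration, not a copy of the callback.

```python
import os
import tempfile


def write_test_results(metrics: dict, output_dir: str) -> str:
    """Write metrics to output_dir/test_results.txt and return the file path."""
    path = os.path.join(output_dir, "test_results.txt")
    with open(path, "w") as writer:
        # Sorted keys keep the file stable from run to run.
        for key in sorted(metrics):
            writer.write(f"{key} = {metrics[key]}\n")
    return path


# Hypothetical metrics, written to a temporary directory for the demo.
with tempfile.TemporaryDirectory() as tmp_dir:
    results_path = write_test_results({"val_loss": 0.42, "acc": 0.91}, tmp_dir)
    with open(results_path) as reader:
        written = reader.read()
```

In a distributed run, a writer like this would normally execute on a single process only, in the same spirit as the rank-zero checkpoint saving noted for on_save_checkpoint.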

Usage Examples

Subclassing BaseTransformer

import argparse

import torch  # needed for torch.stack in validation_end

from lightning_base import BaseTransformer, add_generic_args, generic_train

class MyClassifier(BaseTransformer):
    mode = "sequence-classification"

    def __init__(self, hparams):
        super().__init__(hparams, num_labels=2, mode=self.mode)

    def forward(self, **inputs):
        return self.model(**inputs)

    def training_step(self, batch, batch_idx):
        outputs = self(**batch)
        loss = outputs[0]
        return {"loss": loss}

    def validation_step(self, batch, batch_idx):
        outputs = self(**batch)
        loss = outputs[0]
        return {"val_loss": loss}

    def validation_end(self, outputs):
        avg_loss = torch.stack([x["val_loss"] for x in outputs]).mean()
        return {"val_loss": avg_loss}

    def get_dataloader(self, type_path, batch_size, shuffle=False):
        # Implement dataset loading
        ...

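# Register the shared and model-specific CLI arguments; --gpus is added explicitly in this example.
parser = argparse.ArgumentParser()
add_generic_args(parser, ".")
BaseTransformer.add_model_specific_args(parser, ".")
parser.add_argument("--gpus", type=int, default=1)
args = parser.parse_args()

# Build the module from the parsed arguments and hand it to the shared training helper.
model = MyClassifier(args)
trainer = generic_train(model, args)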

CLI Usage

python my_task.py \
  --model_name_or_path bert-base-uncased \
  --data_dir /path/to/data \
  --output_dir /path/to/output \
  --do_train \
  --learning_rate 5e-5 \
  --lr_scheduler linear \
  --warmup_steps 500 \
  --num_train_epochs 3 \
  --train_batch_size 32 \
  --gpus 1
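
The command above passes --num_train_epochs, while the Inputs table lists max_epochs among the hparams fields. One way the two can line up is argparse's dest option, which stores a flag's value under a different attribute name. The sketch below assumes the parser registers the flag that way; the default values shown are illustrative.

```python
import argparse

parser = argparse.ArgumentParser()
# Assumed registration: the CLI flag is stored under the attribute name max_epochs.
parser.add_argument("--num_train_epochs", dest="max_epochs", default=3, type=int)
parser.add_argument("--train_batch_size", default=32, type=int)

# Parse an explicit argument list instead of sys.argv so the demo is reproducible.
args = parser.parse_args(["--num_train_epochs", "5"])
```

After parsing, the value is available as args.max_epochs and there is no args.num_train_epochs attribute, which matters when code reads the namespace directly.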
