Implementation:Microsoft LoRA Legacy Run OpenAI GPT

Overview

Fine-tuning script for the OpenAI GPT model on the RocStories/Story Cloze task using a double-headed architecture with combined language modeling and multiple choice losses.

Description

run_openai_gpt.py is a legacy HuggingFace Transformers example script included in the Microsoft LoRA NLU example directory. It fine-tunes OpenAIGPTDoubleHeadsModel on the RocStories dataset for the story cloze task, where the model must select the correct ending for a four-sentence story from two candidates. The double-headed model simultaneously optimizes a causal language modeling (LM) loss and a multiple choice (MC) classification loss, combined as loss = lm_coef * lm_loss + mc_loss.
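
The weighting described above can be sketched in a few lines. This is a minimal illustration, not the script's code: the helper name `combined_loss` and the sample loss values are hypothetical, and only the formula and the `lm_coef` default of 0.9 come from this page.

```python
def combined_loss(lm_loss: float, mc_loss: float, lm_coef: float = 0.9) -> float:
    """Training objective as described on this page: lm_coef * lm_loss + mc_loss."""
    return lm_coef * lm_loss + mc_loss


# Hypothetical per-batch loss values, chosen only for illustration.
total = combined_loss(lm_loss=2.0, mc_loss=0.5)
print(f"combined loss: {total:.3f}")
```

Setting `lm_coef` to 0 would leave only the multiple choice loss. The MC term itself has no coefficient, so `lm_coef` alone controls how much the language modeling term contributes.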

The script handles the complete pipeline: loading and tokenizing the RocStories CSV dataset, encoding stories and continuations with special tokens (_start_, _delimiter_, _classify_), preparing input tensors of shape (n_batch, 2, input_len) for the two candidate endings, training with AdamW and linear warmup scheduling, saving the model with standard HuggingFace conventions, and evaluating accuracy on a held-out test set.
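
The input layout for one example can be sketched as follows. This is a simplified pure-Python illustration rather than the script's implementation. It assumes each candidate is encoded as start token, capped story, delimiter token, capped continuation, classify token, that inputs are padded with 0, and that padded label positions use -100 so the LM loss ignores them. The function name `encode_pair` and all token ids are hypothetical.

```python
def encode_pair(story, cont1, cont2, *, cap_length, input_len,
                start_token, delimiter_token, clf_token,
                pad_id=0, ignore_index=-100):
    """Build the two candidate sequences for one story (illustrative sketch).

    Returns (input_ids, mc_token_ids, lm_labels), where input_ids and
    lm_labels each hold 2 rows of length input_len, and mc_token_ids holds
    the position of the classify token in each row.
    """
    input_ids, mc_token_ids, lm_labels = [], [], []
    for cont in (cont1, cont2):
        seq = ([start_token] + story[:cap_length]
               + [delimiter_token] + cont[:cap_length] + [clf_token])
        if len(seq) > input_len:
            raise ValueError("encoded sequence is longer than input_len")
        padding = input_len - len(seq)
        input_ids.append(seq + [pad_id] * padding)        # assumed pad id
        lm_labels.append(seq + [ignore_index] * padding)  # padding ignored by LM loss
        mc_token_ids.append(len(seq) - 1)                 # index of the classify token
    return input_ids, mc_token_ids, lm_labels


# Hypothetical token ids: 1 = _start_, 2 = _delimiter_, 3 = _classify_.
ids, mc_pos, labels = encode_pair(
    story=[11, 12, 13], cont1=[21, 22], cont2=[31],
    cap_length=4, input_len=8,
    start_token=1, delimiter_token=2, clf_token=3,
)
```

Stacking these per-example results over a dataset gives the `(n_batch, 2, input_len)` shape mentioned above, with one row per candidate ending.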

This script is part of the HuggingFace Transformers library (legacy examples) bundled in the Microsoft LoRA repository. It is adapted from the original OpenAI fine-tuning script.

⚠️ DEPRECATED: This file resides in the legacy/ directory and is not actively maintained. Prefer modern equivalents where available.

Usage

Use this script to evaluate or fine-tune the OpenAI GPT model on the RocStories/Story Cloze task. The task tests commonsense reasoning about narrative coherence: given a four-sentence story, the model must pick the correct ending from two candidates.

Code Reference

Source Location

Property	Value
File path	`examples/NLU/examples/legacy/run_openai_gpt.py`
Lines	320
Module	`run_openai_gpt`

Key Functions

Name	Signature	Description
`load_rocstories_dataset`	`load_rocstories_dataset(dataset_path)`	Reads CSV and returns list of `(story, continuation1, continuation2, label)` tuples
`pre_process_datasets`	`pre_process_datasets(encoded_datasets, input_len, cap_length, start_token, delimiter_token, clf_token)`	Builds NumPy arrays for each encoded dataset and converts them to tensors: `input_ids` and `lm_labels` of shape `(n_batch, 2, input_len)`, `mc_token_ids` of shape `(n_batch, 2)`, and `mc_labels` of shape `(n_batch,)`
`accuracy`	`accuracy(out, labels)`	Returns the number of correct predictions (argmax over the candidate axis), not a ratio
`main`	`main()`	Entry point: parses CLI args, loads model/tokenizer, trains and evaluates
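
As a rough illustration of the `accuracy` helper's behavior, the sketch below counts correct predictions with an argmax over each row of logits. It uses plain Python lists instead of arrays, and the name `count_correct` and the sample values are hypothetical.

```python
def count_correct(logits, labels):
    """Count rows whose highest-scoring candidate matches the label."""
    # Index of the largest logit in each row; ties resolve to the first index.
    predictions = [max(range(len(row)), key=row.__getitem__) for row in logits]
    return sum(int(pred == label) for pred, label in zip(predictions, labels))


# Hypothetical logits for three stories with two candidate endings each.
logits = [[0.2, 1.5], [3.0, -1.0], [0.1, 0.4]]
labels = [1, 0, 0]
n_correct = count_correct(logits, labels)
print(n_correct / len(labels))  # fraction of correct predictions
```

Because the helper returns a count, dividing by the number of evaluated examples is what turns it into an accuracy figure.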

CLI Usage

python run_openai_gpt.py \
  --model_name openai-gpt \
  --do_train \
  --do_eval \
  --train_dataset "$ROC_STORIES_DIR/cloze_test_val__spring2016 - cloze_test_ALL_val.csv" \
  --eval_dataset "$ROC_STORIES_DIR/cloze_test_test__spring2016 - cloze_test_ALL_test.csv" \
  --output_dir /path/to/output \
  --train_batch_size 16

I/O Contract

Inputs

Input	Type	Description
`--model_name`	`str` (default `"openai-gpt"`)	Pretrained model name
`--train_dataset`	`str`	Path to RocStories training CSV file
`--eval_dataset`	`str`	Path to RocStories evaluation CSV file
`--output_dir`	`str` (required)	Output directory for model and results
`--train_batch_size`	`int` (default 8)	Training batch size
`--eval_batch_size`	`int` (default 16)	Evaluation batch size
`--num_train_epochs`	`int` (default 3)	Number of training epochs
`--learning_rate`	`float` (default 6.25e-5)	Learning rate for AdamW
`--lm_coef`	`float` (default 0.9)	Coefficient weighting the LM loss relative to the MC loss
`--n_valid`	`int` (default 374)	Number of validation examples
`--warmup_steps`	`int` (default 0)	Linear warmup steps
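
The options in the table can be mirrored with a small `argparse` sketch. This is an approximation for illustration, not a copy of the script's parser. It covers only the options listed above plus the `--do_train` and `--do_eval` switches from the CLI example, which are assumed here to be boolean flags. Help strings are paraphrased from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented options and defaults (illustrative)."""
    parser = argparse.ArgumentParser(description="RocStories fine-tuning options (sketch)")
    parser.add_argument("--model_name", type=str, default="openai-gpt", help="Pretrained model name")
    parser.add_argument("--do_train", action="store_true", help="Run training (assumed flag)")
    parser.add_argument("--do_eval", action="store_true", help="Run evaluation (assumed flag)")
    parser.add_argument("--train_dataset", type=str, help="Path to RocStories training CSV")
    parser.add_argument("--eval_dataset", type=str, help="Path to RocStories evaluation CSV")
    parser.add_argument("--output_dir", type=str, required=True, help="Output directory")
    parser.add_argument("--train_batch_size", type=int, default=8)
    parser.add_argument("--eval_batch_size", type=int, default=16)
    parser.add_argument("--num_train_epochs", type=int, default=3)
    parser.add_argument("--learning_rate", type=float, default=6.25e-5)
    parser.add_argument("--lm_coef", type=float, default=0.9)
    parser.add_argument("--n_valid", type=int, default=374)
    parser.add_argument("--warmup_steps", type=int, default=0)
    return parser


# Only --output_dir is required; everything else falls back to its default.
args = build_parser().parse_args(["--output_dir", "/tmp/out", "--do_train"])
print(args.train_batch_size, args.learning_rate, args.lm_coef)
```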

Outputs

Output	Type	Description
`pytorch_model.bin`	binary	Saved model state dict
`config.json`	JSON	Model configuration
Vocabulary files	files	Tokenizer vocabulary saved to `output_dir`
`eval_results.txt`	text file	Evaluation loss, accuracy, and training loss

RocStories CSV Format

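The input CSV files follow the RocStories/Story Cloze format. Column numbers below are zero-based: column 0 holds the story ID, and the header row is skipped when loading.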

Column	Description
Columns 1-4	Four-sentence story
Column 5	First candidate ending
Column 6	Second candidate ending
Last column	Correct ending label (1 or 2, converted to 0 or 1)
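
A loader for this layout can be sketched with the standard `csv` module. The sketch assumes a header row, a story ID in the first field, and the four story sentences joined with spaces. The function name `parse_rocstories`, the header names, and the sample row are hypothetical, and an in-memory file stands in for a real dataset path.

```python
import csv
import io


def parse_rocstories(lines):
    """Return (story, continuation1, continuation2, label) tuples from CSV lines."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row (assumed to be present)
    examples = []
    for row in reader:
        story = " ".join(row[1:5])   # four story sentences after the ID field
        label = int(row[-1]) - 1     # 1 or 2 in the file, 0 or 1 in memory
        examples.append((story, row[5], row[6], label))
    return examples


# Hypothetical two-line file: one header row and one example.
sample = io.StringIO(
    "id,s1,s2,s3,s4,ending1,ending2,answer\n"
    "x1,A.,B.,C.,D.,Good end.,Bad end.,1\n"
)
examples = parse_rocstories(sample)
print(examples)
```

Subtracting 1 from the label makes it usable directly as an index into the two candidate endings.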

Usage Examples

Training and Evaluation

python run_openai_gpt.py \
  --model_name openai-gpt \
  --do_train \
  --do_eval \
  --train_dataset /data/rocstories/cloze_test_val__spring2016.csv \
  --eval_dataset /data/rocstories/cloze_test_test__spring2016.csv \
  --output_dir /output/openai_gpt_rocstories/ \
  --train_batch_size 16 \
  --eval_batch_size 16 \
  --num_train_epochs 3 \
  --learning_rate 6.25e-5 \
  --lm_coef 0.9 \
  --seed 42

Evaluation Only

python run_openai_gpt.py \
  --model_name openai-gpt \
  --do_eval \
  --eval_dataset /data/rocstories/cloze_test_test__spring2016.csv \
  --output_dir /output/openai_gpt_rocstories/
