Implementation:Microsoft LoRA Legacy Run OpenAI GPT
Overview
Fine-tuning script for the OpenAI GPT model on the RocStories/Story Cloze task using a double-headed architecture with combined language modeling and multiple choice losses.
Description
run_openai_gpt.py is a legacy HuggingFace Transformers example script included in the Microsoft LoRA NLU example directory. It fine-tunes OpenAIGPTDoubleHeadsModel on the RocStories dataset for the story cloze task, where the model must select the correct ending for a four-sentence story from two candidates. The double-headed model simultaneously optimizes a causal language modeling (LM) loss and a multiple choice (MC) classification loss, combined as loss = lm_coef * lm_loss + mc_loss.
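As a small illustration of the loss combination described above, the sketch below weights the LM loss by lm_coef and adds the MC loss unweighted. The function name combined_loss and the numeric values are hypothetical and chosen only for illustration; in the script, both losses are produced by the double-headed model rather than passed in as plain floats.

```python
def combined_loss(lm_loss: float, mc_loss: float, lm_coef: float = 0.9) -> float:
    """Combine the two objectives as lm_coef * lm_loss + mc_loss.

    Hypothetical helper: the name and the float-based signature are
    assumptions made for this sketch, not part of the script.
    """
    return lm_coef * lm_loss + mc_loss


if __name__ == "__main__":
    # With the documented default lm_coef of 0.9, the LM term is scaled
    # down slightly while the MC term is added as-is.
    print(combined_loss(lm_loss=2.0, mc_loss=0.5))
```

Setting lm_coef to 0 in this sketch reduces the objective to the multiple choice loss alone, which is one way to see what role the LM term plays as an auxiliary objective.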
The script handles the complete pipeline: loading and tokenizing the RocStories CSV dataset, encoding stories and continuations with special tokens (_start_, _delimiter_, _classify_), preparing input tensors of shape (n_batch, 2, input_len) for the two candidate endings, training with AdamW and linear warmup scheduling, saving the model with standard HuggingFace conventions, and evaluating accuracy on a held-out test set.
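The input layout described above can be sketched for a single example as follows. This is a minimal pure-Python illustration, not the script's code: the function name build_candidate_inputs is hypothetical, and the token order (start, story, delimiter, continuation, classify), the padding value 0, and the ignore label -100 are assumptions of this sketch. The script itself works on whole datasets and array types rather than nested lists.

```python
from typing import List, Sequence, Tuple

PAD_ID = 0           # assumption: padded input positions are filled with 0
IGNORE_LABEL = -100  # assumption: padded LM-label positions use an "ignore" value


def build_candidate_inputs(
    story: Sequence[int],
    cont1: Sequence[int],
    cont2: Sequence[int],
    input_len: int,
    cap_length: int,
    start_token: int,
    delimiter_token: int,
    clf_token: int,
) -> Tuple[List[List[int]], List[int], List[List[int]]]:
    """Return (input_ids, mc_token_ids, lm_labels) for one two-candidate example.

    input_ids and lm_labels have shape (2, input_len); mc_token_ids has
    shape (2,) and holds the position of the classify token per candidate.
    """
    input_ids, mc_token_ids, lm_labels = [], [], []
    for cont in (cont1, cont2):
        # Story and continuation are each truncated to cap_length tokens.
        seq = (
            [start_token]
            + list(story[:cap_length])
            + [delimiter_token]
            + list(cont[:cap_length])
            + [clf_token]
        )
        if len(seq) > input_len:
            raise ValueError("sequence longer than input_len")
        pad = input_len - len(seq)
        input_ids.append(seq + [PAD_ID] * pad)
        lm_labels.append(seq + [IGNORE_LABEL] * pad)
        mc_token_ids.append(len(seq) - 1)  # index of the classify token
    return input_ids, mc_token_ids, lm_labels
```

Stacking the result of this function over n_batch examples gives the (n_batch, 2, input_len) layout, with one row per candidate ending.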
This script is part of the HuggingFace Transformers library (legacy examples) bundled in the Microsoft LoRA repository. It is adapted from the original OpenAI fine-tuning script.
⚠️ DEPRECATED: This file resides in the legacy/ directory and is not actively maintained. Prefer modern equivalents where available.
Usage
Use this script to evaluate or fine-tune the OpenAI GPT model on the RocStories/Story Cloze task. The task tests a model's ability to perform grounded commonsense reasoning about narrative coherence by choosing the correct story ending.
Code Reference
Source Location
| Property | Value |
|---|---|
| File path | examples/NLU/examples/legacy/run_openai_gpt.py |
| Lines | 320 |
| Module | run_openai_gpt |
Key Functions
| Name | Signature | Description |
|---|---|---|
| load_rocstories_dataset | load_rocstories_dataset(dataset_path) | Reads the CSV and returns a list of (story, continuation1, continuation2, label) tuples |
| pre_process_datasets | pre_process_datasets(encoded_datasets, input_len, cap_length, start_token, delimiter_token, clf_token) | Converts encoded datasets to arrays: input_ids and lm_labels of shape (n_batch, 2, input_len), mc_token_ids with one classify-token position per candidate, and mc_labels with one label per example |
| accuracy | accuracy(out, labels) | Computes the number of correct predictions via argmax |
| main | main() | Entry point: parses CLI args, loads the model and tokenizer, trains, and evaluates |
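The argmax-based counting that the accuracy entry describes can be sketched without any array library. The name count_correct and the list-based signature are assumptions of this example; the script's own accuracy function operates on array outputs.

```python
from typing import Sequence


def count_correct(out: Sequence[Sequence[float]], labels: Sequence[int]) -> int:
    """Count rows whose highest-scoring candidate matches the label.

    Hypothetical stand-in for an argmax-based accuracy helper: each row
    of `out` holds one score per candidate ending.
    """
    # Index of the largest score in each row (first index wins on ties).
    predictions = [max(range(len(scores)), key=lambda j: scores[j]) for scores in out]
    return sum(1 for pred, label in zip(predictions, labels) if pred == label)


if __name__ == "__main__":
    scores = [[0.1, 0.9], [2.0, -1.0], [0.3, 0.4]]
    print(count_correct(scores, [1, 0, 0]))
```

Because the helper returns a count rather than a ratio, dividing the accumulated count by the number of evaluated examples yields the accuracy.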
CLI Usage
python run_openai_gpt.py \
    --model_name openai-gpt \
    --do_train \
    --do_eval \
    --train_dataset "$ROC_STORIES_DIR/cloze_test_val__spring2016 - cloze_test_ALL_val.csv" \
    --eval_dataset "$ROC_STORIES_DIR/cloze_test_test__spring2016 - cloze_test_ALL_test.csv" \
    --output_dir /path/to/output \
    --train_batch_size 16
I/O Contract
Inputs
| Input | Type | Description |
|---|---|---|
| --model_name | str (default "openai-gpt") | Pretrained model name |
| --train_dataset | str | Path to RocStories training CSV file |
| --eval_dataset | str | Path to RocStories evaluation CSV file |
| --output_dir | str (required) | Output directory for model and results |
| --train_batch_size | int (default 8) | Training batch size |
| --eval_batch_size | int (default 16) | Evaluation batch size |
| --num_train_epochs | int (default 3) | Number of training epochs |
| --learning_rate | float (default 6.25e-5) | Learning rate for AdamW |
| --lm_coef | float (default 0.9) | Coefficient weighting the LM loss relative to the MC loss |
| --n_valid | int (default 374) | Number of validation examples |
| --warmup_steps | int (default 0) | Linear warmup steps |
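To make the input contract concrete, the following argparse sketch declares the flags and defaults listed in the table. It is an approximation for illustration, not the script's parser: the function name build_parser is hypothetical, the empty-string defaults for the dataset paths are an assumption (the table gives none), and --do_train and --do_eval are modeled as boolean switches because the CLI usage passes them without values.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Sketch of the documented CLI")
    parser.add_argument("--model_name", type=str, default="openai-gpt")
    # Assumption: boolean switches, since the usage passes them bare.
    parser.add_argument("--do_train", action="store_true")
    parser.add_argument("--do_eval", action="store_true")
    # Assumption: empty-string defaults; the table lists no default.
    parser.add_argument("--train_dataset", type=str, default="")
    parser.add_argument("--eval_dataset", type=str, default="")
    parser.add_argument("--output_dir", type=str, required=True)
    parser.add_argument("--train_batch_size", type=int, default=8)
    parser.add_argument("--eval_batch_size", type=int, default=16)
    parser.add_argument("--num_train_epochs", type=int, default=3)
    parser.add_argument("--learning_rate", type=float, default=6.25e-5)
    parser.add_argument("--lm_coef", type=float, default=0.9)
    parser.add_argument("--n_valid", type=int, default=374)
    parser.add_argument("--warmup_steps", type=int, default=0)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--output_dir", "out", "--do_eval"])
    print(args.model_name, args.train_batch_size, args.lm_coef)
```

Only --output_dir is marked as required in the table, so parsing fails in this sketch when it is omitted, while every other flag falls back to its default.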
Outputs
| Output | Type | Description |
|---|---|---|
| pytorch_model.bin | binary | Saved model state dict |
| config.json | JSON | Model configuration |
| Vocabulary files | files | Tokenizer vocabulary saved to output_dir |
| eval_results.txt | text file | Evaluation loss, accuracy, and training loss |
RocStories CSV Format
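The input CSV files follow the RocStories/Story Cloze format. Column numbers below are zero-based, so the four story sentences start at the second field of each row: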
| Column | Description |
|---|---|
| Columns 1-4 | Four-sentence story |
| Column 5 | First candidate ending |
| Column 6 | Second candidate ending |
| Last column | Correct ending label (1 or 2, converted to 0 or 1) |
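A reader for this layout might look like the sketch below. It is a hedged illustration rather than the script's loader: the function name parse_rocstories is hypothetical, the column numbers are read as zero-based indices, and skipping a header row is an assumption of this example. The sample row is invented purely to exercise the parsing.

```python
import csv
import io
from typing import Iterable, List, Tuple


def parse_rocstories(lines: Iterable[str]) -> List[Tuple[str, str, str, int]]:
    """Return (story, continuation1, continuation2, label) tuples.

    Assumptions of this sketch: the first row is a header, columns 1-4
    (zero-based) hold the story, columns 5 and 6 the two endings, and
    the last column holds the 1-based index of the correct ending.
    """
    reader = csv.reader(lines)
    next(reader)  # assumption: skip a header row
    examples = []
    for row in reader:
        story = " ".join(row[1:5])  # join the four story sentences
        label = int(row[-1]) - 1    # convert 1/2 to 0/1
        examples.append((story, row[5], row[6], label))
    return examples


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = io.StringIO(
        "id,s1,s2,s3,s4,ending1,ending2,answer\n"
        "x1,A.,B.,C.,D.,End one.,End two.,2\n"
    )
    print(parse_rocstories(sample))
```

Using csv.reader rather than splitting on commas matters here because story sentences can themselves contain quoted commas.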
Usage Examples
Training and Evaluation
python run_openai_gpt.py \
    --model_name openai-gpt \
    --do_train \
    --do_eval \
    --train_dataset /data/rocstories/cloze_test_val__spring2016.csv \
    --eval_dataset /data/rocstories/cloze_test_test__spring2016.csv \
    --output_dir /output/openai_gpt_rocstories/ \
    --train_batch_size 16 \
    --eval_batch_size 16 \
    --num_train_epochs 3 \
    --learning_rate 6.25e-5 \
    --lm_coef 0.9 \
    --seed 42
Evaluation Only
python run_openai_gpt.py \
    --model_name openai-gpt \
    --do_eval \
    --eval_dataset /data/rocstories/cloze_test_test__spring2016.csv \
    --output_dir /output/openai_gpt_rocstories/