Implementation:Microsoft LoRA Legacy Run NER

Overview

Fine-tuning script for named entity recognition (NER) on CoNLL-2003 formatted datasets using AutoModelForTokenClassification with seqeval-based evaluation metrics.

Description

run_ner.py is a legacy HuggingFace Transformers example script included in the Microsoft LoRA NLU example directory. It fine-tunes any model compatible with AutoModelForTokenClassification for token-level classification tasks such as named entity recognition (NER) and part-of-speech (POS) tagging on CoNLL-2003 formatted data.

The script uses HfArgumentParser with three dataclasses: ModelArguments (model path, config, tokenizer, task type), DataTrainingArguments (data directory, labels file, max sequence length), and the built-in TrainingArguments. Task types are dynamically loaded from a tasks module via importlib, allowing extensibility to custom token classification tasks beyond NER.
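
The dynamic task lookup described above can be sketched as follows. This is a minimal illustration rather than the script's exact code: the `load_task` helper, the stand-in `TokenClassificationTask` and `NER` classes, and the fake `tasks` module registered in `sys.modules` are all assumptions made so the example runs without the example directory on the import path.

```python
import sys
import types
from importlib import import_module


class TokenClassificationTask:
    """Stand-in base class (assumption); the real one ships with the example code."""


class NER(TokenClassificationTask):
    """Placeholder task class (assumption); the real one also reads CoNLL files."""


def load_task(task_type: str, module_name: str = "tasks") -> TokenClassificationTask:
    """Resolve `task_type` as a class name inside `module_name` and instantiate it."""
    module = import_module(module_name)
    try:
        task_cls = getattr(module, task_type)
    except AttributeError:
        known = [cls.__name__ for cls in TokenClassificationTask.__subclasses__()]
        raise ValueError(
            f"Task {task_type!r} not found in module {module_name!r}; known tasks: {known}"
        )
    return task_cls()


# Register a fake `tasks` module so the sketch is runnable on its own.
_fake_tasks = types.ModuleType("tasks")
_fake_tasks.NER = NER
sys.modules["tasks"] = _fake_tasks

task = load_task("NER")
print(type(task).__name__)
```

Because the lookup is by attribute name, adding a new task under this scheme only requires defining another class in the `tasks` module and passing its name via `--task_type`.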

Evaluation uses the seqeval library to compute entity-level precision, recall, and F1, plus token-level accuracy. The align_predictions function takes the argmax of the model outputs and maps predicted and true label indices back to string labels, skipping every position whose true label equals CrossEntropyLoss().ignore_index (-100 by default). That covers padding as well as any other position the dataset masks out of the loss. The script supports train, evaluate, and predict phases; in the predict phase, predictions are written to test_predictions.txt in the original CoNLL format.
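
The alignment step can be illustrated with a dependency-free sketch. The script itself operates on NumPy arrays inside a nested function; the version below uses plain lists and takes `label_map` as an explicit argument, both of which are simplifications made for this example. The constant `IGNORE_INDEX = -100` is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
from typing import Dict, List, Sequence, Tuple

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def align_predictions(
    logits: Sequence[Sequence[Sequence[float]]],
    label_ids: Sequence[Sequence[int]],
    label_map: Dict[int, str],
) -> Tuple[List[List[str]], List[List[str]]]:
    """Return (predicted, gold) label strings per sentence, skipping masked positions."""
    preds_list: List[List[str]] = []
    gold_list: List[List[str]] = []
    for sent_logits, sent_labels in zip(logits, label_ids):
        sent_preds: List[str] = []
        sent_gold: List[str] = []
        for token_logits, gold_id in zip(sent_logits, sent_labels):
            if gold_id == IGNORE_INDEX:
                continue  # masked position: padding or otherwise excluded from the loss
            # argmax over the label axis
            pred_id = max(range(len(token_logits)), key=token_logits.__getitem__)
            sent_preds.append(label_map[pred_id])
            sent_gold.append(label_map[gold_id])
        preds_list.append(sent_preds)
        gold_list.append(sent_gold)
    return preds_list, gold_list


label_map = {0: "O", 1: "B-PER", 2: "I-PER"}
logits = [[[0.1, 0.8, 0.1], [0.2, 0.1, 0.7], [0.9, 0.05, 0.05], [0.3, 0.3, 0.4]]]
label_ids = [[1, IGNORE_INDEX, 2, IGNORE_INDEX]]
preds, gold = align_predictions(logits, label_ids, label_map)
print(preds)  # [['B-PER', 'O']]
print(gold)   # [['B-PER', 'I-PER']]
```

The two masked positions drop out of both lists, so the predicted and gold sequences stay the same length, which is what sequence-level metric libraries expect.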

The script also supports JSON configuration files as an alternative to command-line arguments, and includes a TPU spawn entry point for distributed training on TPU pods.

The script comes from the legacy examples of the HuggingFace Transformers library, a copy of which is bundled in the Microsoft LoRA repository.

⚠️ DEPRECATED: This file resides in the legacy/ directory and is not actively maintained. Prefer modern equivalents where available.

Usage

Use this script to fine-tune a pretrained transformer model for named entity recognition on CoNLL-2003 or similarly formatted datasets. It supports any token classification task that can be defined as a TokenClassificationTask subclass.

Code Reference

Source Location

Property	Value
File path	`examples/NLU/examples/legacy/token-classification/run_ner.py`
Lines	321
Module	`run_ner`

Key Classes and Functions

Name	Type	Signature / Description
`ModelArguments`	dataclass	Fields: `model_name_or_path`, `config_name`, `task_type` (default `"NER"`), `tokenizer_name`, `use_fast`, `cache_dir`
`DataTrainingArguments`	dataclass	Fields: `data_dir`, `labels`, `max_seq_length` (default 128), `overwrite_cache`
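`align_predictions`	function (nested)	`align_predictions(predictions: np.ndarray, label_ids: np.ndarray) -> Tuple[List[int], List[int]]` -- takes the argmax over the label axis and returns per-sentence lists of predicted and true label strings (the `List[int]` annotation notwithstanding), excluding positions marked with the ignore index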
`compute_metrics`	function (nested)	`compute_metrics(p: EvalPrediction) -> Dict` -- computes seqeval accuracy, precision, recall, F1
`main`	function	Entry point: parses args, loads task class, builds model/tokenizer/datasets/trainer, runs train/eval/predict
`_mp_fn`	function	TPU spawn entry point

CLI Usage

python run_ner.py \
  --model_name_or_path bert-base-cased \
  --data_dir /path/to/conll2003 \
  --output_dir /path/to/output \
  --do_train \
  --do_eval \
  --do_predict \
  --max_seq_length 128 \
  --per_device_train_batch_size 16 \
  --num_train_epochs 3

I/O Contract

Inputs

Input	Type	Description
`--model_name_or_path`	`str` (required)	Pretrained model name or path
`--data_dir`	`str` (required)	Directory with CoNLL-2003 formatted `train.txt`, `dev.txt`, `test.txt`
`--labels`	`Optional[str]`	Path to labels file; if not specified, CoNLL-2003 labels are used
`--task_type`	`str` (default `"NER"`)	Task class name to import from `tasks` module (e.g., `"NER"`, `"POS"`)
`--max_seq_length`	`int` (default 128)	Maximum input sequence length after tokenization
`--use_fast`	flag	Use fast tokenizer
`--overwrite_cache`	flag	Overwrite cached dataset files
Standard `TrainingArguments`	various	`--output_dir`, `--do_train`, `--do_eval`, `--do_predict`, `--per_device_train_batch_size`, `--num_train_epochs`, `--fp16`, etc.
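
The `--labels` fallback can be sketched as below. This is a hedged reconstruction, not the script's code: the default tag list is an assumption based on the standard CoNLL-2003 BIO tag set, and the `get_labels` function here is a standalone stand-in for the task class method that resolves labels.

```python
from typing import List, Optional

# Assumed default: BIO tags over the four CoNLL-2003 entity types.
CONLL2003_LABELS = [
    "O", "B-MISC", "I-MISC", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC",
]


def get_labels(path: Optional[str] = None) -> List[str]:
    """Read one label per line from `path`, or fall back to the CoNLL-2003 tag set."""
    if not path:
        return list(CONLL2003_LABELS)
    with open(path, encoding="utf-8") as f:
        labels = [line.strip() for line in f if line.strip()]
    if "O" not in labels:
        labels = ["O"] + labels  # make sure the "outside" tag is always present
    return labels


labels = get_labels()
# Index/label maps of the kind a token classification config carries.
id2label = dict(enumerate(labels))
label2id = {label: i for i, label in enumerate(labels)}
print(len(labels))  # 9
```

The number of labels determines the size of the classification head, so a labels file that does not match the tags in `train.txt` will typically surface as a lookup error while features are built.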

Data Format (CoNLL-2003)

The input text files use CoNLL column format with blank lines separating sentences:

EU B-ORG
rejects O
German B-MISC
call O
to O
boycott O
British B-MISC
lamb O
. O

Peter B-PER
Blackburn I-PER
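
A reader for this format can be sketched in a few lines. The `read_conll` function and its handling of `-DOCSTART-` lines and single-column (unlabeled) rows are illustrative assumptions, not a copy of the task class's reader.

```python
from typing import Iterable, List, Tuple

Sentence = Tuple[List[str], List[str]]  # (words, labels)


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group whitespace-separated token/label rows into sentences split by blank lines."""
    sentences: List[Sentence] = []
    words: List[str] = []
    labels: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("-DOCSTART-"):
            if words:  # a blank line or document marker closes the current sentence
                sentences.append((words, labels))
                words, labels = [], []
            continue
        columns = line.split()
        words.append(columns[0])
        # The label is the last column; unlabeled rows fall back to "O".
        labels.append(columns[-1] if len(columns) > 1 else "O")
    if words:  # flush the final sentence if the input lacks a trailing blank line
        sentences.append((words, labels))
    return sentences


sample = "EU B-ORG\nrejects O\n\nPeter B-PER\nBlackburn I-PER\n"
for sentence_words, sentence_labels in read_conll(sample.splitlines()):
    print(sentence_words, sentence_labels)
```

Taking the last column as the label keeps the sketch usable with the four-column CoNLL-2003 distribution files as well as the two-column form shown above.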

Outputs

Output	Type	Description
Saved model	directory	Model, tokenizer, and config saved to `output_dir`
`eval_results.txt`	text file	Evaluation metrics (accuracy, precision, recall, F1)
`test_results.txt`	text file	Test set metrics
`test_predictions.txt`	text file	Per-token NER predictions in CoNLL format
Return value	`Dict[str, float]`	Evaluation metrics returned by `main()`; populated only when `--do_eval` is set

Evaluation Metrics

Metric	Source	Description
`accuracy_score`	seqeval	Token-level accuracy
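`precision`	seqeval	Entity-level precision (a predicted entity counts only if its span and type both match)
`recall`	seqeval	Entity-level recall (a true entity counts only if its span and type are both predicted)
`f1`	seqeval	Entity-level F1 score (harmonic mean of the precision and recall above)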
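
To make the difference between entity-level and token-level scores concrete, the sketch below recomputes the four metrics in plain Python. It is a simplified illustration of exact-match entity scoring over BIO tags, not seqeval's implementation; `extract_entities` and `entity_scores` are hypothetical helper names.

```python
from typing import Dict, List, Optional, Set, Tuple

Entity = Tuple[int, str, int, int]  # (sentence index, type, start, end exclusive)


def extract_entities(sentences: List[List[str]]) -> Set[Entity]:
    """Collect BIO-tagged spans; an I- tag with no matching open span starts a new one."""
    entities: Set[Entity] = set()
    for s_idx, tags in enumerate(sentences):
        start: Optional[int] = None
        ent_type: Optional[str] = None
        for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
            prefix, _, tag_type = tag.partition("-")
            continues = prefix == "I" and tag_type == ent_type
            if start is not None and not continues:
                entities.add((s_idx, ent_type, start, i))
                start, ent_type = None, None
            if prefix == "B" or (prefix == "I" and start is None):
                start, ent_type = i, tag_type
    return entities


def entity_scores(gold: List[List[str]], pred: List[List[str]]) -> Dict[str, float]:
    gold_set, pred_set = extract_entities(gold), extract_entities(pred)
    correct = len(gold_set & pred_set)  # span and type must both match
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    total = sum(len(tags) for tags in gold)
    hits = sum(g == p for gs, ps in zip(gold, pred) for g, p in zip(gs, ps))
    accuracy = hits / total if total else 0.0
    return {"accuracy_score": accuracy, "precision": precision, "recall": recall, "f1": f1}


gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "O", "O", "B-LOC"]]
print(entity_scores(gold, pred))
# {'accuracy_score': 0.75, 'precision': 0.5, 'recall': 0.5, 'f1': 0.5}
```

Here three of four tokens are tagged correctly, yet the truncated person span counts as a miss at the entity level, which is why F1 is usually the headline number for NER while accuracy is the more forgiving one.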

Usage Examples

Fine-tuning BERT for NER on CoNLL-2003

python run_ner.py \
  --model_name_or_path bert-base-cased \
  --data_dir /data/conll2003/ \
  --labels /data/conll2003/labels.txt \
  --output_dir /output/bert_ner/ \
  --do_train \
  --do_eval \
  --do_predict \
  --max_seq_length 128 \
  --per_device_train_batch_size 16 \
  --per_device_eval_batch_size 16 \
  --num_train_epochs 3 \
  --learning_rate 5e-5 \
  --overwrite_output_dir

NER with JSON Configuration

# config.json:
# {
#   "model_name_or_path": "bert-base-cased",
#   "data_dir": "/data/conll2003/",
#   "output_dir": "/output/bert_ner/",
#   "do_train": true,
#   "do_eval": true,
#   "max_seq_length": 128,
#   "per_device_train_batch_size": 16,
#   "num_train_epochs": 3
# }

python run_ner.py config.json
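
The choice between a JSON file and command-line flags comes down to a check on the argument list: a single argument ending in `.json` is treated as a configuration file. The sketch below mimics that dispatch with the standard library only. `parse_run_args` is a hypothetical helper, and `argparse` stands in for `HfArgumentParser`, which in the real script fills the three argument dataclasses instead of returning a dictionary.

```python
import argparse
import json
import os
from typing import Any, Dict, List


def parse_run_args(argv: List[str]) -> Dict[str, Any]:
    """Parse arguments from a JSON file or from flags; argv[0] is the script name."""
    if len(argv) == 2 and argv[1].endswith(".json"):
        # Single .json argument: read every setting from the file.
        with open(os.path.abspath(argv[1]), encoding="utf-8") as f:
            return json.load(f)
    # Otherwise fall back to ordinary flags (only a small subset is modeled here).
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name_or_path", required=True)
    parser.add_argument("--data_dir", required=True)
    parser.add_argument("--output_dir", required=True)
    parser.add_argument("--max_seq_length", type=int, default=128)
    parser.add_argument("--do_train", action="store_true")
    return vars(parser.parse_args(argv[1:]))


args = parse_run_args(
    ["run_ner.py", "--model_name_or_path", "bert-base-cased",
     "--data_dir", "/data/conll2003/", "--output_dir", "/output/bert_ner/", "--do_train"]
)
print(args["max_seq_length"], args["do_train"])  # 128 True
```

One consequence of this check is that the JSON path and flags are mutually exclusive: passing `config.json` together with extra flags makes the argument count differ from two, so the file name would be parsed as an ordinary argument.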

Fine-tuning for POS Tagging

python run_ner.py \
  --model_name_or_path bert-base-cased \
  --task_type POS \
  --data_dir /data/pos_tagging/ \
  --labels /data/pos_tagging/labels.txt \
  --output_dir /output/bert_pos/ \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --per_device_train_batch_size 32 \
  --num_train_epochs 5 \
  --overwrite_output_dir
