Environment:Microsoft LoRA NLG Eval External Tools

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, NLG_Evaluation
Last Updated: 2026-02-10 05:30 GMT

Overview

External evaluation tool dependencies (Perl, Java, and Python packages) required for computing the NLG metrics BLEU, METEOR, chrF++, TER, and BERTScore, plus BLEURT for English.

Description

The NLG evaluation pipeline depends on external tools in addition to standard Python packages. BLEU is computed by a Perl script (`multi-bleu-detok.perl`), METEOR requires Java 1.8+ and the METEOR JAR, and the remaining metrics use Python packages (`pyter` for TER, `bert_score` for BERTScore, `nltk` for tokenization). The evaluation scripts are fetched by `download_evalscript.sh`, which clones the WebNLG GenerationEval and e2e-metrics repositories; the Python packages are installed with pip (see Quick Install).

Usage

Use this environment when running the NLG Evaluation Metrics workflow step. It is required by the `eval.py` script, which computes BLEU, METEOR, chrF++, TER, BERTScore, and BLEURT on text generated by GPT-2 LoRA models.

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| OS | Linux | Perl and Java must be globally accessible |
| Runtime | Perl 5+ | Required for the `multi-bleu-detok.perl` BLEU script |
| Runtime | Java 1.8+ | Required for the METEOR JAR (`meteor-1.5.jar`); needs `-Xmx2G` heap |
| Disk | ~500MB | For cloned evaluation repos and the METEOR JAR |

Dependencies

System Packages

  • `perl` (globally installed)
  • `java` >= 1.8 (globally installed, needs 2GB heap for METEOR)
  • `git` (for cloning evaluation repos)

Python Packages

  • `pyter` (TER computation)
  • `bert_score` (BERTScore computation)
  • `nltk` (tokenization)
  • `razdel` (Russian tokenization)
  • `tabulate` (result formatting)
  • `codecs` (standard library, no installation needed)
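
The dependency lists above can be checked before a long evaluation run. The following is a minimal sketch, not part of the repository: the helper names `missing_tools` and `missing_modules` are hypothetical, and the tool and module names are taken from the lists on this page.

```python
import importlib.util
import shutil

# Names copied from the System Packages and Python Packages lists.
REQUIRED_TOOLS = ("perl", "java", "git")
REQUIRED_MODULES = ("pyter", "bert_score", "nltk", "razdel", "tabulate")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the executables that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level modules that cannot be located for import."""
    return [mod for mod in modules if importlib.util.find_spec(mod) is None]


if __name__ == "__main__":
    print("missing tools:", missing_tools() or "none")
    print("missing modules:", missing_modules() or "none")
```

This only confirms that the executables and modules can be found; it does not verify the Java version or that the METEOR JAR and BLEU script have been downloaded.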

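External Repositories

  • WebNLG GenerationEval (cloned by `download_evalscript.sh`)
  • e2e-metrics (cloned by `download_evalscript.sh`)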

Credentials

No credentials required.

Quick Install

# Install Python evaluation packages
pip install pyter bert_score nltk razdel tabulate

# Run the evaluation setup script
cd examples/NLG
bash eval/download_evalscript.sh

Code Evidence

Perl BLEU dependency from `examples/NLG/eval/eval.py:59`:

BLEU_PATH = 'metrics/multi-bleu-detok.perl'

Java METEOR dependency from `examples/NLG/eval/eval.py:60`:

METEOR_PATH = 'metrics/meteor-1.5/meteor-1.5.jar'

Perl invocation from `examples/NLG/eval/eval.py:112`:

command = 'perl {0} {1} < {2}'.format(BLEU_PATH, ' '.join(ref_files), hyps_path)
result = subprocess.check_output(command, shell=True)
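
The invocation above builds a shell string and redirects the hypothesis file with `<`. As an illustration only, the same call can be made without a shell by passing an argument list and feeding the file on stdin. This is a sketch under assumptions: `run_bleu` and `parse_bleu` are hypothetical helpers, and the regular expression assumes the script prints a line containing `BLEU = <number>,`.

```python
import re
import subprocess

BLEU_PATH = "metrics/multi-bleu-detok.perl"  # same path as in eval.py


def parse_bleu(output):
    """Extract the BLEU value from the script's text output, or None."""
    match = re.search(r"BLEU = ([0-9.]+),", output)
    return float(match.group(1)) if match else None


def run_bleu(ref_files, hyps_path, bleu_path=BLEU_PATH):
    """Run the Perl script with the hypotheses on stdin; return BLEU or None."""
    with open(hyps_path, "rb") as hyps:
        try:
            result = subprocess.run(
                ["perl", bleu_path, *ref_files],
                stdin=hyps, capture_output=True, check=True,
            )
        except (FileNotFoundError, subprocess.CalledProcessError):
            return None  # perl not found, or the script exited with an error
    return parse_bleu(result.stdout.decode("utf-8", errors="replace"))
```

Passing a list avoids quoting problems when reference paths contain spaces, and returning `None` lets the caller distinguish a failed run from a genuine score.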

Java invocation from `examples/NLG/eval/eval.py:155-156`:

command = 'java -Xmx2G -jar {0} '.format(METEOR_PATH)
command += '{0} {1} -l {2} -norm -r {3}'.format(hyps_tmp, refs_tmp, lng, num_refs)
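
The two-step string concatenation above can also be expressed as a single argument list, which makes the heap setting and each METEOR flag explicit. A minimal sketch, assuming a hypothetical helper `meteor_command`; the flags and their order are copied from the lines quoted above.

```python
METEOR_PATH = "metrics/meteor-1.5/meteor-1.5.jar"  # same path as in eval.py


def meteor_command(hyps_tmp, refs_tmp, lng, num_refs,
                   meteor_path=METEOR_PATH, heap="2G"):
    """Build the java argument list equivalent to the command string above."""
    return [
        "java", "-Xmx" + heap, "-jar", meteor_path,
        hyps_tmp, refs_tmp,
        "-l", lng,             # language code, e.g. "en"
        "-norm",               # flag taken verbatim from the quoted command
        "-r", str(num_refs),   # number of references per hypothesis
    ]


if __name__ == "__main__":
    # Print the command for inspection instead of running it.
    print(" ".join(meteor_command("hyps.txt", "refs.txt", "en", 3)))
```

The resulting list can be passed directly to `subprocess.run` without `shell=True`.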

Error message for missing Perl from `examples/NLG/eval/eval.py:117-118` (the message text names METEOR even though it is logged for the Perl-based BLEU dependency):

logging.error('ERROR ON COMPUTING METEOR. MAKE SURE YOU HAVE PERL INSTALLED GLOBALLY ON YOUR MACHINE.')

Error message for missing Java from `examples/NLG/eval/eval.py:160-161`:

logging.error('ERROR ON COMPUTING METEOR. MAKE SURE YOU HAVE JAVA INSTALLED GLOBALLY ON YOUR MACHINE.')

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `ERROR ON COMPUTING METEOR. MAKE SURE YOU HAVE PERL INSTALLED GLOBALLY ON YOUR MACHINE.` | Perl not installed or BLEU script missing | Install Perl and run `bash eval/download_evalscript.sh` |
| `ERROR ON COMPUTING METEOR. MAKE SURE YOU HAVE JAVA INSTALLED GLOBALLY ON YOUR MACHINE.` | Java not installed or METEOR JAR missing | Install Java 1.8+ and run `bash eval/download_evalscript.sh` |
| `ModuleNotFoundError: No module named 'pyter'` | pyter package not installed | `pip install pyter` |
| `ModuleNotFoundError: No module named 'bert_score'` | bert_score package not installed | `pip install bert_score` |

Compatibility Notes

  • Russian evaluation: Uses `razdel` tokenizer instead of NLTK when `--language ru` is specified.
  • BLEURT: Only available for English (`lng == 'en'`). Requires a BLEURT checkpoint at `metrics/bleurt/bleurt-base-128`.
  • BERTScore: Falls back to 0 precision/recall/F1 on failure (e.g., if model download fails).
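
The BERTScore fallback described in the last note can be sketched as a wrapper that returns zeros when scoring fails. This is an illustration, not the code in `eval.py`: `safe_bert_score` is a hypothetical name, the `scorer` parameter exists only so the fallback can be exercised without downloading a model, and the default path assumes the `score(cands, refs, lang=...)` function from the `bert_score` package.

```python
def safe_bert_score(hyps, refs, lng="en", scorer=None):
    """Return mean (precision, recall, F1), or zeros if scoring fails."""
    try:
        if scorer is None:
            # Imported lazily so a missing package also triggers the fallback.
            from bert_score import score as scorer
        precision, recall, f1 = scorer(hyps, refs, lang=lng)
        return float(precision.mean()), float(recall.mean()), float(f1.mean())
    except Exception:
        # Covers failed model downloads, import errors, and runtime errors.
        return 0.0, 0.0, 0.0
```

Because a failure and a genuinely poor result both yield 0 here, it may be worth logging the exception before returning so the two cases can be told apart.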
