Implementation:Openai Whisper English Spelling Mappings

From Leeroopedia
Knowledge Sources
Domains NLP, Text_Normalization
Last Updated 2026-02-13 22:00 GMT

Overview

Data file containing British-to-American English spelling mappings used by EnglishSpellingNormalizer for transcript text standardization.

Description

The english.json file is a JSON dictionary with approximately 1741 entries mapping British English spellings to their American English equivalents. Mappings cover systematic spelling differences including "-ise"/"-ize" (e.g., "organise" → "organize"), "-our"/"-or" (e.g., "colour" → "color"), "-re"/"-er" (e.g., "centre" → "center"), "-ogue"/"-og" (e.g., "catalogue" → "catalog"), and doubled/single consonant variants.
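The systematic patterns listed above can be illustrated with a small classifier. This is a minimal sketch, not part of Whisper: `SUFFIX_RULES` and `classify_pair` are hypothetical names introduced here, and the word pairs are the examples quoted in this section plus one doubled-consonant pair chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical rule table built from the suffix patterns named in this section.
SUFFIX_RULES: List[Tuple[str, str]] = [
    ("ise", "ize"),
    ("our", "or"),
    ("re", "er"),
    ("ogue", "og"),
]


def classify_pair(british: str, american: str) -> str:
    """Return the suffix pattern a British/American pair follows, or "other"."""
    for brit_suffix, amer_suffix in SUFFIX_RULES:
        if british.endswith(brit_suffix) and american.endswith(amer_suffix):
            # The pair matches only if the stems before the suffix are identical.
            if british[: -len(brit_suffix)] == american[: -len(amer_suffix)]:
                return f"-{brit_suffix}/-{amer_suffix}"
    return "other"


print(classify_pair("organise", "organize"))   # -ise/-ize
print(classify_pair("colour", "color"))        # -our/-or
print(classify_pair("centre", "center"))       # -re/-er
print(classify_pair("catalogue", "catalog"))   # -ogue/-og
print(classify_pair("travelled", "traveled"))  # other
```

Pairs that fall outside the four suffix rules, such as doubled/single consonant variants, land in the "other" bucket here; the data file itself stores every pair explicitly rather than deriving them from rules.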

This data file is loaded by the EnglishSpellingNormalizer class to standardize spelling variations before computing Word Error Rate (WER) metrics, ensuring that British and American English transcriptions are treated as equivalent.
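To see why this matters for WER, the sketch below applies a mapping word by word and compares a British hypothesis with an American reference. It assumes a simple whitespace split and exact, case-sensitive lookup; `standardize_spelling` and `SAMPLE_MAPPING` are hypothetical names, and the three entries are a tiny subset taken from the examples in this section.

```python
from typing import Dict

# Tiny illustrative subset; the real english.json holds far more entries.
SAMPLE_MAPPING: Dict[str, str] = {
    "colour": "color",
    "centre": "center",
    "organise": "organize",
}


def standardize_spelling(text: str, mapping: Dict[str, str]) -> str:
    """Replace every whitespace-separated word that appears as a key in mapping."""
    return " ".join(mapping.get(word, word) for word in text.split())


reference = "the color of the center"
hypothesis = "the colour of the centre"

# Without normalization the two transcripts differ in two words.
print(hypothesis == reference)  # False

# After normalization they are identical, so no word errors are counted.
print(standardize_spelling(hypothesis, SAMPLE_MAPPING) == reference)  # True
```

Because the lookup in this sketch is exact, a capitalized word such as "Colour" would pass through unchanged unless the text is lowercased first.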

Usage

EnglishSpellingNormalizer reads this file automatically at initialization. Users typically do not need to load it directly; instead, use EnglishTextNormalizer or EnglishSpellingNormalizer, both of which load the mapping internally.

Code Reference

Source Location

Schema

{
    "<british_spelling>": "<american_spelling>",
    "accessorise": "accessorize",
    "colour": "color",
    "centre": "center"
}

Import

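import json
import os

# Resolve english.json relative to the module that loads it
mapping_path = os.path.join(os.path.dirname(__file__), "english.json")

# Use a context manager so the file handle is closed after reading
with open(mapping_path) as f:
    mapping = json.load(f)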

I/O Contract

Inputs

Name | Type | Required | Description
file_path | str | Yes | Path to english.json (typically resolved relative to the normalizers package)

Outputs

Name | Type | Description
mapping | Dict[str, str] | Dictionary mapping British spellings (keys) to American spellings (values)
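The contract above can be expressed as a small typed loader. This is a sketch under stated assumptions: `load_spelling_mapping` is a hypothetical helper, not a Whisper function, and the demonstration writes a throwaway two-entry file instead of reading the real english.json.

```python
import json
import os
import tempfile
from typing import Dict


def load_spelling_mapping(file_path: str) -> Dict[str, str]:
    """Hypothetical helper: read a spelling file and check its basic shape."""
    with open(file_path, encoding="utf-8") as f:
        mapping = json.load(f)
    if not isinstance(mapping, dict):
        raise ValueError("expected a JSON object at the top level")
    for british, american in mapping.items():
        if not isinstance(american, str):
            raise ValueError(f"value for {british!r} is not a string")
    return mapping


# Demonstrate with a temporary sample file that follows the same schema.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.json")
    with open(sample_path, "w", encoding="utf-8") as f:
        json.dump({"colour": "color", "centre": "center"}, f)
    loaded = load_spelling_mapping(sample_path)

print(loaded["colour"])  # color
```

The shape check is optional; it simply fails early if a file does not match the string-to-string schema shown earlier.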

Usage Examples

Direct Loading

import json

# Relative path: assumes the working directory is the root of the Whisper repository
with open("whisper/normalizers/english.json") as f:
    mapping = json.load(f)

# Look up a British spelling
american = mapping.get("colour", "colour")
print(american)  # "color"

Via EnglishTextNormalizer

from whisper.normalizers import EnglishTextNormalizer

normalizer = EnglishTextNormalizer()

# British spellings are automatically converted
text = normalizer("The colour of the centre was analysed")
# Output: "the color of the center was analyzed"
