Implementation:Pytorch Serve NMT Translation Handler

Overview

LanguageTranslationHandler is a TorchServe handler for neural machine translation using fairseq's TransformerModel. It extends BaseHandler and provides beam search translation with configurable BPE settings loaded from setup_config.json. The handler accepts text input, translates using beam search with beam size 5, and returns JSON output pairing the original input with its translation.

| Field | Value |
|---|---|
| Implementation Name | NMT_Translation_Handler |
| Type | Example Handler |
| Workflow | Neural_Machine_Translation |
| Domains | NLP, Machine_Translation |
| Knowledge Sources | Pytorch_Serve |
| Last Updated | 2026-02-13 18:52 GMT |

Description

The LanguageTranslationHandler class implements the full inference lifecycle for sequence-to-sequence neural machine translation. During initialization, it reads the BPE configuration from setup_config.json in the model directory, loads a fairseq TransformerModel with the Moses tokenizer, moves the model to the GPU assigned by TorchServe when CUDA is available (otherwise the CPU), and switches it to evaluation mode. Translation uses beam search with a fixed beam width of 5.

Key Responsibilities

  • Configuration Loading: Reads setup_config.json for the BPE setting and the name of the translated-output field
  • Model Loading: Calls fairseq's TransformerModel.from_pretrained() with the Moses tokenizer and the configured BPE
  • Text Preprocessing: Extracts the text from each request and decodes the bytes as UTF-8
  • Beam Search Translation: Calls model.translate() with beam=5 under torch.no_grad()
  • JSON Output: Returns a list of JSON strings, each pairing the input text with its translation under the configured output field name

Usage

from model_handler_generalized import LanguageTranslationHandler

The handler requires a setup_config.json in the model directory:

{
    "bpe": "fastbpe",
    "translated_output": "french_output"
}
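
Because the handler reads both keys from this file, it can help to validate the file before packaging the model archive. The following is a minimal sketch and not part of the handler: `load_setup_config` and `REQUIRED_KEYS` are hypothetical names, and raising an exception instead of logging a warning is an assumption made here.

```python
import json
import os
import tempfile

# Keys the handler reads: "bpe" in initialize(), "translated_output" in inference().
REQUIRED_KEYS = ("bpe", "translated_output")


def load_setup_config(model_dir):
    """Load setup_config.json from model_dir and check the keys the handler uses."""
    path = os.path.join(model_dir, "setup_config.json")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Missing setup_config.json in {model_dir}")
    with open(path, encoding="utf-8") as config_file:
        config = json.load(config_file)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"setup_config.json is missing keys: {missing}")
    return config


if __name__ == "__main__":
    # Write a throwaway config into a temporary directory and load it back.
    with tempfile.TemporaryDirectory() as model_dir:
        config_path = os.path.join(model_dir, "setup_config.json")
        with open(config_path, "w", encoding="utf-8") as f:
            json.dump({"bpe": "fastbpe", "translated_output": "french_output"}, f)
        print(load_setup_config(model_dir))
```

Failing early with a named file or key makes a packaging mistake easier to spot than a later lookup error inside the worker.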

Code Reference

Source Location

| File | Lines | Description |
|---|---|---|
| examples/nmt_transformer/model_handler_generalized.py | L1-74 | Full handler module (73 lines) |
| examples/nmt_transformer/model_handler_generalized.py | L10-74 | LanguageTranslationHandler class definition |
| examples/nmt_transformer/model_handler_generalized.py | L18-49 | initialize(context) -- config loading, fairseq model setup |
| examples/nmt_transformer/model_handler_generalized.py | L51-57 | preprocess(data) -- text extraction and UTF-8 decoding |
| examples/nmt_transformer/model_handler_generalized.py | L59-70 | inference(data) -- beam search translation with JSON formatting |
| examples/nmt_transformer/model_handler_generalized.py | L72-73 | postprocess(data) -- identity passthrough |

Signature

class LanguageTranslationHandler(BaseHandler):

    def __init__(self):
        self._context = None
        self.initialized = False
        self.model = None
        self.device = None

    def initialize(self, context):
        """
        Load fairseq TransformerModel with Moses tokenizer and BPE config.

        Reads setup_config.json from model_dir for BPE settings.
        Loads TransformerModel.from_pretrained() with checkpoint file
        'model.pt' and Moses tokenizer.

        Args:
            context: TorchServe context with system_properties and manifest.
        """
        ...

    def preprocess(self, data):
        """
        Extract and decode text inputs from request data.

        Args:
            data (list): List of dicts with "data" or "body" keys
                         containing bytes-encoded text.

        Returns:
            list[str]: List of decoded UTF-8 text strings.
        """
        ...

    def inference(self, data, *args, **kwargs):
        """
        Translate input texts using beam search.

        Calls model.translate() with beam=5 under torch.no_grad().
        Returns JSON strings pairing input text with translation.

        Args:
            data (list[str]): List of source language text strings.

        Returns:
            list[str]: List of JSON strings with input and translation.
        """
        ...

    def postprocess(self, data):
        """
        Return inference output unchanged.

        Args:
            data (list): JSON string list from inference.

        Returns:
            list: Same as input.
        """
        ...
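
For a regular prediction request, TorchServe's BaseHandler.handle() passes a batch through preprocess, inference, and postprocess in that order. The sketch below mimics that chain with a stand-in class so the data shapes can be checked without fairseq or TorchServe installed. `StubTranslationHandler` and `run_pipeline` are hypothetical names, and the reversed-string "translation" is only a placeholder for model.translate().

```python
import json


class StubTranslationHandler:
    """Stand-in with the same method names as the handler; no model is loaded."""

    def preprocess(self, data):
        # Each request row carries bytes under "data" or "body".
        return [(row.get("data") or row.get("body")).decode("utf-8") for row in data]

    def inference(self, data):
        # Placeholder for model.translate(data, beam=5): reverse each string.
        return [json.dumps({"input": text, "output": text[::-1]}) for text in data]

    def postprocess(self, data):
        # Identity passthrough, matching the postprocess signature above.
        return data


def run_pipeline(handler, batch):
    """Chain the three stages in the order the serving loop calls them."""
    return handler.postprocess(handler.inference(handler.preprocess(batch)))


if __name__ == "__main__":
    print(run_pipeline(StubTranslationHandler(), [{"data": b"abc"}]))
```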

Import

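# Handler imports
import json
import logging
import os

import torch
from fairseq.models.transformer import TransformerModel
from ts.torch_handler.base_handler import BaseHandler

# Module-level logger used by initialize() for the missing-config warning
logger = logging.getLogger(__name__)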

I/O Contract

| Method | Input | Output | Notes |
|---|---|---|---|
| initialize(context) | Context whose system_properties["model_dir"] directory contains setup_config.json and model.pt | None (sets self.model, self.setup_config, self.initialized = True) | Logs a warning if setup_config.json is missing; self.setup_config is then never set, so the later self.setup_config["bpe"] lookup raises AttributeError |
| preprocess(data) | list[dict] with "data"/"body" containing bytes | list[str] -- UTF-8 decoded text | Calls .decode('utf-8') on each input |
| inference(data) | list[str] -- source language texts | list[str] -- JSON strings with input/translation pairs | Uses beam=5; output key from setup_config["translated_output"] |
| postprocess(data) | list[str] from inference | list[str] | Identity passthrough |
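
The preprocess row above can be sketched as a standalone function. This is an approximation inferred from the contract, not a copy of the source. `decode_requests` is a hypothetical name, and passing already-decoded str values through unchanged is an assumption added here for convenience.

```python
def decode_requests(data):
    """Return one UTF-8 string per request row."""
    texts = []
    for row in data:
        # The payload is expected under the "data" key, falling back to "body".
        payload = row.get("data") or row.get("body")
        if isinstance(payload, (bytes, bytearray)):
            payload = payload.decode("utf-8")
        texts.append(payload)
    return texts


if __name__ == "__main__":
    batch = [{"data": b"Hello, how are you?"}, {"body": "Good morning".encode("utf-8")}]
    print(decode_requests(batch))
```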

Request/Response Format

// Request (single text input)
{
    "data": "Hello, how are you?"
}

// Response (JSON string)
{
    "input": "Hello, how are you?",
    "french_output": "Bonjour, comment allez-vous?"
}
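
A client might send the source text as the raw request body and parse the returned JSON string. The sketch below only builds the request object so it can be inspected offline. `build_translation_request` and `parse_translation_response` are hypothetical helpers; the host, port 8080, the model name `my_nmt_model`, and the `application/octet-stream` content type (chosen so the payload is likely to reach the handler as bytes) are assumptions that depend on the serving setup.

```python
import json
import urllib.request


def build_translation_request(text, model_name, base_url="http://localhost:8080"):
    """Build a POST request for TorchServe's /predictions/<model_name> endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/predictions/{model_name}",
        data=text.encode("utf-8"),  # the handler decodes the body as UTF-8
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def parse_translation_response(body, output_field):
    """Decode the JSON string returned by the handler into (input, translation)."""
    payload = json.loads(body)
    return payload["input"], payload[output_field]


if __name__ == "__main__":
    request = build_translation_request("Hello, how are you?", "my_nmt_model")
    print(request.full_url)
    # To actually send it (requires a running server):
    # with urllib.request.urlopen(request) as response:
    #     print(parse_translation_response(response.read(), "french_output"))
```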

Usage Examples

Example 1: Initialization with BPE Configuration

# From model_handler_generalized.py L18-49: initialize() loads config and model
def initialize(self, context):
    self._context = context
    self.initialized = True
    self.manifest = context.manifest

    properties = context.system_properties
    model_dir = properties.get("model_dir")

    self.device = torch.device(
        "cuda:" + str(properties.get("gpu_id"))
        if torch.cuda.is_available() and properties.get("gpu_id") is not None
        else "cpu"
    )

    # Read BPE config from setup_config.json
    setup_config_path = os.path.join(model_dir, "setup_config.json")
    if os.path.isfile(setup_config_path):
        with open(setup_config_path) as setup_config_file:
            self.setup_config = json.load(setup_config_file)
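    else:
        # self.setup_config is left unset on this path, so the bpe lookup
        # below raises AttributeError after the warning is logged.
        # logger is assumed to be a module-level logging.getLogger(__name__).
        logger.warning('Missing the setup_config.json file.')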

    # Load fairseq TransformerModel with Moses tokenizer
    self.model = TransformerModel.from_pretrained(
        model_dir,
        checkpoint_file='model.pt',
        data_name_or_path=model_dir,
        tokenizer='moses',
        bpe=self.setup_config["bpe"]
    )
    self.model.to(self.device)
    self.model.eval()
    self.initialized = True
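
The device choice in initialize() reduces to a small rule: use the assigned GPU only when CUDA is available and a gpu_id was provided. The sketch below isolates that rule as a pure function so it can be checked without torch. `select_device_name` is a hypothetical helper, and its return value is the string that would be passed to torch.device().

```python
def select_device_name(cuda_available, gpu_id):
    """Return "cuda:<gpu_id>" when a GPU is usable, otherwise "cpu"."""
    if cuda_available and gpu_id is not None:
        return "cuda:" + str(gpu_id)
    return "cpu"


if __name__ == "__main__":
    for cuda_available, gpu_id in [(True, 0), (True, None), (False, 1)]:
        print(cuda_available, gpu_id, "->", select_device_name(cuda_available, gpu_id))
```

Inside the handler this would correspond to select_device_name(torch.cuda.is_available(), properties.get("gpu_id")).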

Example 2: Beam Search Inference

# From model_handler_generalized.py L59-70: inference() with beam=5
def inference(self, data, *args, **kwargs):
    inference_output = []
    with torch.no_grad():
        translation = self.model.translate(data, beam=5)
    for i in range(0, len(data)):
        output = {
            "input": data[i],
            self.setup_config["translated_output"]: translation[i]
        }
        inference_output.append(json.dumps(output))
    return inference_output
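
The pairing logic can be exercised without a trained checkpoint by substituting a stub for the fairseq model. In this sketch `StubModel` and `pair_translations` are hypothetical; the stub's translate() only mirrors the call shape used above (a list of sentences plus a beam keyword) and upper-cases its input in place of a real translation.

```python
import json


class StubModel:
    """Mimics the translate(sentences, beam=...) call shape; not a real model."""

    def translate(self, sentences, beam=5):
        return [sentence.upper() for sentence in sentences]


def pair_translations(model, sentences, output_field, beam=5):
    """Return one JSON string per sentence, pairing input with translation."""
    translations = model.translate(sentences, beam=beam)
    return [
        json.dumps({"input": source, output_field: target})
        for source, target in zip(sentences, translations)
    ]


if __name__ == "__main__":
    for line in pair_translations(StubModel(), ["hello", "good morning"], "french_output"):
        print(line)
```

Note that json.dumps escapes non-ASCII characters by default, so accented characters in a translation appear as \uXXXX sequences in the raw string and are restored by json.loads on the client side.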
