Implementation:Pytorch Serve SAM Fast Handler

Overview

SegmentAnythingFastHandler is a TorchServe handler that serves the Segment Anything Fast model for automatic mask generation. It extends BaseHandler and provides image-to-mask inference with COCO RLE output encoding. The handler accepts images as base64-encoded strings or raw bytes, converts them to BGR arrays via OpenCV, generates segmentation masks, and returns the result as a pickle-serialized, base64-encoded string.

Field	Value
Implementation Name	SAM_Fast_Handler
Type	Example Handler
Workflow	Instance_Segmentation_Serving
Domains	Computer_Vision, Instance_Segmentation
Knowledge Sources	Pytorch_Serve
Last Updated	2026-02-13 18:52 GMT

Description

The SegmentAnythingFastHandler class implements the full inference lifecycle for the Segment Anything Fast model. During initialization, it loads the SAM model from a checkpoint using the sam_model_fast_registry, configures the SamAutomaticMaskGenerator with COCO RLE output mode, and selects a CUDA or CPU device. The preprocess, inference, and postprocess methods are decorated with @timed for performance instrumentation; initialize is not.

Key Responsibilities

Model Loading: Loads SAM checkpoint via sam_model_fast_registry[model_type](checkpoint=sam_checkpoint) and moves to device
Mask Generator Setup: Creates SamAutomaticMaskGenerator with configurable process_batch_size and output_mode="coco_rle"
Image Preprocessing: Accepts base64 strings, raw bytes, or list inputs; converts to BGR numpy arrays via OpenCV
Mask Generation: Supports only a batch size of 1 because of a SAM limitation (enforced by an assertion); calls mask_generator.generate() on the single image
Serialization: Pickle-serializes mask data and base64-encodes for transport

Usage

from custom_handler import SegmentAnythingFastHandler

The handler is configured through a model YAML config:

# model-config.yaml for SAM Fast
handler:
    model_type: "vit_h"
    sam_checkpoint: "sam_vit_h_4b8939.pth"
    process_batch_size: 4
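
The lookup the handler performs on this config can be tried out without TorchServe. The snippet below is a minimal sketch, assuming the parsed YAML arrives as a plain dict shaped like ctx.model_yaml_config; read_handler_config is a hypothetical helper for illustration and is not part of the handler.

```python
def read_handler_config(model_yaml_config):
    """Return (model_type, sam_checkpoint, process_batch_size).

    Hypothetical helper: reads the same three keys the handler reads from the
    "handler" section and fails early with a clear message if one is missing.
    """
    handler_cfg = model_yaml_config.get("handler", {})
    required = ("model_type", "sam_checkpoint", "process_batch_size")
    missing = [key for key in required if key not in handler_cfg]
    if missing:
        raise KeyError(f"missing handler config keys: {missing}")
    return tuple(handler_cfg[key] for key in required)


# Dict equivalent of the YAML config shown above.
config = {
    "handler": {
        "model_type": "vit_h",
        "sam_checkpoint": "sam_vit_h_4b8939.pth",
        "process_batch_size": 4,
    }
}
model_type, sam_checkpoint, process_batch_size = read_handler_config(config)
```

The handler itself indexes the keys directly, so a missing key there would surface as a plain KeyError during initialize(); the explicit check here only makes that failure easier to read.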

Code Reference

Source Location

File	Lines	Description
`examples/large_models/segment_anything_fast/custom_handler.py`	L1-91	Full handler module (90 lines)
`examples/large_models/segment_anything_fast/custom_handler.py`	L19-91	`SegmentAnythingFastHandler` class definition
`examples/large_models/segment_anything_fast/custom_handler.py`	L25-52	`initialize(ctx)` -- model loading and mask generator setup
`examples/large_models/segment_anything_fast/custom_handler.py`	L54-73	`preprocess(data)` -- base64/bytes to BGR OpenCV array
`examples/large_models/segment_anything_fast/custom_handler.py`	L75-80	`inference(data)` -- mask generation (batch size 1)
`examples/large_models/segment_anything_fast/custom_handler.py`	L82-90	`postprocess(data)` -- pickle + base64 encoding

Signature

class SegmentAnythingFastHandler(BaseHandler):

    def __init__(self):
        super().__init__()
        self.mask_generator = None
        self.initialized = False

    def initialize(self, ctx):
        """
        Load SAM model and create mask generator.

        Reads model_type, sam_checkpoint, process_batch_size from
        ctx.model_yaml_config["handler"]. Loads model via
        sam_model_fast_registry and creates SamAutomaticMaskGenerator.

        Args:
            ctx: TorchServe context with system_properties and model_yaml_config.
        """
        ...

    @timed
    def preprocess(self, data):
        """
        Convert input images to BGR numpy arrays.

        Accepts base64 strings (decoded), raw bytes (opened via PIL),
        or list data (converted to FloatTensor). All converted to
        BGR via cv2.cvtColor.

        Args:
            data (list): List of dicts with "data" or "body" keys.

        Returns:
            list: List of BGR numpy arrays.
        """
        ...

    @timed
    def inference(self, data):
        """
        Generate segmentation masks.

        Currently enforces batch size of 1 via assertion.
        Calls self.mask_generator.generate() on the single image.

        Args:
            data (list): Single-element list of BGR numpy arrays.

        Returns:
            list: COCO RLE encoded mask annotations.
        """
        ...

    @timed
    def postprocess(self, data):
        """
        Serialize mask data for transport.

        Pickle-serializes the mask annotations and base64-encodes
        the result for JSON-safe transport.

        Args:
            data: Mask annotation data from inference.

        Returns:
            list: Single-element list with base64-encoded pickled string.
        """
        ...

Import

# Handler imports
import base64
import io
import os  # used by os.path.join in the initialize() listing
import pickle
import cv2
import numpy as np
import torch
from PIL import Image
from segment_anything_fast import SamAutomaticMaskGenerator, sam_model_fast_registry
from ts.handler_utils.timer import timed
from ts.torch_handler.base_handler import BaseHandler

I/O Contract

Method	Input	Output	Notes
`initialize(ctx)`	Context with `handler.model_type`, `handler.sam_checkpoint`, `handler.process_batch_size`	None (sets `self.model`, `self.mask_generator`, `self.initialized = True`)	Auto-selects CUDA or CPU
`preprocess(data)`	`list[dict]` with `"data"`/`"body"` containing base64 string, bytes, or list	`list[numpy.ndarray]` -- BGR images	Converts RGB to BGR via `cv2.cvtColor`
`inference(data)`	`list[numpy.ndarray]` (must be length 1)	COCO RLE mask annotations	Asserts `len(data) == 1`
`postprocess(data)`	Mask annotations from inference	`list[str]` -- single base64-encoded pickled string	Uses `pickle.dumps` then `base64.b64encode`
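
The batch-size constraint in the inference row can be illustrated with a stand-in generator. This is a minimal sketch under stated assumptions: StubMaskGenerator is a hypothetical replacement for SamAutomaticMaskGenerator, the annotation it returns is made up, and the assertion message is illustrative rather than the handler's own.

```python
class StubMaskGenerator:
    """Hypothetical stand-in for SamAutomaticMaskGenerator."""

    def generate(self, image):
        # Made-up annotation; the real generator returns COCO RLE records.
        return [{"area": 4, "bbox": [0, 0, 2, 2]}]


def inference(mask_generator, data):
    # Same kind of guard the handler applies: exactly one image per request.
    assert len(data) == 1, "only a batch size of 1 is supported"
    return mask_generator.generate(data[0])


fake_image = [[0, 0], [0, 0]]  # placeholder for a BGR numpy array

# A single-image request passes the guard.
masks = inference(StubMaskGenerator(), [fake_image])

# A two-image batch trips the assertion.
try:
    inference(StubMaskGenerator(), [fake_image, fake_image])
    rejected = False
except AssertionError:
    rejected = True
```

Because a larger batch fails outright rather than being split, this suggests keeping the TorchServe batch size for this model at 1.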

Usage Examples

Example 1: Model Initialization

# From custom_handler.py L25-52: initialize() loads SAM model
def initialize(self, ctx):
    properties = ctx.system_properties
    model_dir = properties.get("model_dir")
    self.device = "cpu"
    if torch.cuda.is_available() and properties.get("gpu_id") is not None:
        self.map_location = "cuda"
        self.device = torch.device(
            self.map_location + ":" + str(properties.get("gpu_id"))
        )
        torch.cuda.set_device(self.device)

    model_type = ctx.model_yaml_config["handler"]["model_type"]
    sam_checkpoint = os.path.join(
        model_dir, ctx.model_yaml_config["handler"]["sam_checkpoint"]
    )
    process_batch_size = ctx.model_yaml_config["handler"]["process_batch_size"]

    self.model = sam_model_fast_registry[model_type](checkpoint=sam_checkpoint)
    self.model.to(self.device)

    self.mask_generator = SamAutomaticMaskGenerator(
        self.model, process_batch_size=process_batch_size, output_mode="coco_rle"
    )
    self.initialized = True
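
The device branch at the top of initialize() can be isolated as a pure function for quick checking. This is a minimal sketch: select_device is a hypothetical helper that takes the two conditions as arguments instead of calling torch.cuda.is_available() and reading ctx.system_properties.

```python
def select_device(cuda_available, gpu_id):
    """Return the device string that the branch in initialize() would pick.

    Hypothetical helper: cuda_available stands in for
    torch.cuda.is_available(), gpu_id for properties.get("gpu_id").
    """
    if cuda_available and gpu_id is not None:
        return "cuda:" + str(gpu_id)
    return "cpu"


on_first_gpu = select_device(True, 0)    # gpu_id 0 is valid: the test is "is not None"
no_gpu_id = select_device(True, None)    # CUDA present but no GPU assigned
no_cuda = select_device(False, 1)        # GPU id given but CUDA unavailable
```

Note that the check is `is not None` rather than truthiness, so a gpu_id of 0 still selects the GPU.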

Example 2: Image Preprocessing Pipeline

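# From custom_handler.py L54-73: preprocess() handles multiple input formats
@timed
def preprocess(self, data):
    images = []
    for row in data:
        # TorchServe delivers the payload under "data" or "body"
        image = row.get("data") or row.get("body")
        if isinstance(image, str):
            # Text payloads are treated as base64-encoded image bytes
            image = base64.b64decode(image)
        if isinstance(image, (bytearray, bytes)):
            # Raw or decoded bytes are opened as a PIL image
            image = Image.open(io.BytesIO(image))
        else:
            # Anything else (e.g. a nested list) becomes a float tensor
            image = torch.FloatTensor(image)
        # Swap channel order from RGB to BGR
        image = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)
        images.append(image)
    return images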
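
From the client side, the first branch of preprocess() means a base64 string and the raw bytes it encodes end up as the same input. The snippet below is a minimal sketch of that equivalence; decode_payload is a hypothetical helper that copies only the string-decoding step, and the byte string is a placeholder rather than a valid image.

```python
import base64


def decode_payload(image):
    """Mimic the first branch of preprocess(): str payloads are base64-decoded."""
    if isinstance(image, str):
        image = base64.b64decode(image)
    return image


raw = b"\x89PNG\r\n\x1a\n-not-a-real-image-"  # placeholder bytes, not a valid PNG
as_text = base64.b64encode(raw).decode("utf-8")  # what a JSON client might send

# Both request styles yield identical bytes for the PIL step that follows.
same_bytes = decode_payload(as_text) == decode_payload(raw)
```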

Example 3: Postprocess Serialization

# From custom_handler.py L82-90: postprocess() serializes mask data
@timed
def postprocess(self, data):
    serialized_data = pickle.dumps(data)
    base64_encoded_data = base64.b64encode(serialized_data).decode("utf-8")
    return [base64_encoded_data]
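
A client has to reverse these two steps to recover the mask annotations. The following is a minimal sketch, assuming the response body is the base64 string produced above; decode_masks is a hypothetical helper and the annotation content is a made-up placeholder, not real model output.

```python
import base64
import pickle


def decode_masks(encoded):
    """Reverse postprocess(): base64 text -> pickled bytes -> Python object.

    pickle.loads can execute arbitrary code, so only call this on responses
    from a server you control.
    """
    return pickle.loads(base64.b64decode(encoded))


# Simulate what postprocess() returns for a made-up annotation list.
fake_masks = [{"segmentation": {"size": [2, 2], "counts": "placeholder"}}]
response = [base64.b64encode(pickle.dumps(fake_masks)).decode("utf-8")]

restored = decode_masks(response[0])
```

Pickle keeps the annotation structure intact without a custom schema, at the cost of tying the client to Python and to a trusted server.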

Related Pages

Principle:Pytorch_Serve_Instance_Segmentation -- principle for serving instance segmentation models with mask generation
Implementation:Pytorch_Serve_BaseHandler -- parent class providing the handle() orchestration
