Implementation:Pytorch Serve SAM Fast Handler

From Leeroopedia

Overview

SegmentAnythingFastHandler is a TorchServe handler that serves the Segment Anything Fast model for automatic mask generation. It extends BaseHandler and performs image-to-mask inference with COCO RLE output encoding. The handler accepts images as base64-encoded strings or raw bytes, converts them to BGR arrays with OpenCV, generates segmentation masks, and returns the result as a pickled, base64-encoded string.

Field | Value
Implementation Name | SAM_Fast_Handler
Type | Example Handler
Workflow | Instance_Segmentation_Serving
Domains | Computer_Vision, Instance_Segmentation
Knowledge Sources | Pytorch_Serve
Last Updated | 2026-02-13 18:52 GMT

Description

The SegmentAnythingFastHandler class implements the full inference lifecycle for the Segment Anything Fast model. During initialization, it selects a CUDA or CPU device, loads the SAM model from a checkpoint through sam_model_fast_registry, and configures SamAutomaticMaskGenerator with COCO RLE output mode. The preprocess, inference, and postprocess methods are decorated with @timed for performance instrumentation.

Key Responsibilities

  • Model Loading: Loads the SAM checkpoint via sam_model_fast_registry[model_type](checkpoint=sam_checkpoint) and moves the model to the selected device
  • Mask Generator Setup: Creates SamAutomaticMaskGenerator with a configurable process_batch_size and output_mode="coco_rle"
  • Image Preprocessing: Accepts base64 strings, raw bytes, or list inputs; converts them to BGR numpy arrays via OpenCV
  • Mask Generation: Accepts exactly one image per call (a current limitation of the SAM mask generator) and calls mask_generator.generate() on it
  • Serialization: Pickle-serializes the mask data and base64-encodes it for transport

Usage

from custom_handler import SegmentAnythingFastHandler

The handler is configured through a model YAML config:

# model-config.yaml for SAM Fast
handler:
    model_type: "vit_h"
    sam_checkpoint: "sam_vit_h_4b8939.pth"
    process_batch_size: 4
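
As a rough sketch of how these values reach the handler, the snippet below reads the three keys from a stand-in context object. The names read_handler_config and fake_ctx are hypothetical and exist only for illustration; in a running server TorchServe builds the context itself, and the dictionary here simply mirrors the YAML above.

```python
import os
from types import SimpleNamespace


def read_handler_config(ctx):
    """Pull the SAM Fast settings out of a TorchServe-style context."""
    handler_cfg = ctx.model_yaml_config["handler"]
    model_dir = ctx.system_properties.get("model_dir")
    return {
        "model_type": handler_cfg["model_type"],
        # The checkpoint file name is resolved relative to the model directory.
        "sam_checkpoint": os.path.join(model_dir, handler_cfg["sam_checkpoint"]),
        "process_batch_size": handler_cfg["process_batch_size"],
    }


# Hypothetical stand-in for the context TorchServe passes to initialize().
fake_ctx = SimpleNamespace(
    system_properties={"model_dir": "/tmp/model", "gpu_id": None},
    model_yaml_config={
        "handler": {
            "model_type": "vit_h",
            "sam_checkpoint": "sam_vit_h_4b8939.pth",
            "process_batch_size": 4,
        }
    },
)

config = read_handler_config(fake_ctx)
print(config["model_type"], config["process_batch_size"])
```

A missing key raises KeyError in this sketch, so a misconfigured YAML file would fail at load time rather than during the first request.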

Code Reference

Source Location

File | Lines | Description
examples/large_models/segment_anything_fast/custom_handler.py | L1-91 | Full handler module (90 lines)
examples/large_models/segment_anything_fast/custom_handler.py | L19-91 | SegmentAnythingFastHandler class definition
examples/large_models/segment_anything_fast/custom_handler.py | L25-52 | initialize(ctx) -- model loading and mask generator setup
examples/large_models/segment_anything_fast/custom_handler.py | L54-73 | preprocess(data) -- base64/bytes to BGR OpenCV array
examples/large_models/segment_anything_fast/custom_handler.py | L75-80 | inference(data) -- mask generation (batch size 1)
examples/large_models/segment_anything_fast/custom_handler.py | L82-90 | postprocess(data) -- pickle + base64 encoding

Signature

class SegmentAnythingFastHandler(BaseHandler):

    def __init__(self):
        super().__init__()
        self.mask_generator = None
        self.initialized = False

    def initialize(self, ctx):
        """
        Load SAM model and create mask generator.

        Reads model_type, sam_checkpoint, process_batch_size from
        ctx.model_yaml_config["handler"]. Loads model via
        sam_model_fast_registry and creates SamAutomaticMaskGenerator.

        Args:
            ctx: TorchServe context with system_properties and model_yaml_config.
        """
        ...

    @timed
    def preprocess(self, data):
        """
        Convert input images to BGR numpy arrays.

        Accepts base64 strings (decoded to bytes), raw bytes (opened
        via PIL), or list data (converted to FloatTensor). Every input
        is then converted from RGB to BGR via cv2.cvtColor.

        Args:
            data (list): List of dicts with "data" or "body" keys.

        Returns:
            list: List of BGR numpy arrays.
        """
        ...

    @timed
    def inference(self, data):
        """
        Generate segmentation masks.

        Currently enforces batch size of 1 via assertion.
        Calls self.mask_generator.generate() on the single image.

        Args:
            data (list): Single-element list of BGR numpy arrays.

        Returns:
            list: COCO RLE encoded mask annotations.
        """
        ...

    @timed
    def postprocess(self, data):
        """
        Serialize mask data for transport.

        Pickle-serializes the mask annotations and base64-encodes
        the result for JSON-safe transport.

        Args:
            data: Mask annotation data from inference.

        Returns:
            list: Single-element list with base64-encoded pickled string.
        """
        ...

Import

# Handler imports
import base64
import io
import os
import pickle
import cv2
import numpy as np
import torch
from PIL import Image
from segment_anything_fast import SamAutomaticMaskGenerator, sam_model_fast_registry
from ts.handler_utils.timer import timed
from ts.torch_handler.base_handler import BaseHandler

I/O Contract

Method | Input | Output | Notes
initialize(ctx) | Context with handler.model_type, handler.sam_checkpoint, handler.process_batch_size | None (sets self.model, self.mask_generator, self.initialized = True) | Auto-selects CUDA or CPU
preprocess(data) | list[dict] with "data"/"body" containing base64 string, bytes, or list | list[numpy.ndarray] -- BGR images | Converts RGB to BGR via cv2.cvtColor
inference(data) | list[numpy.ndarray] (must be length 1) | COCO RLE mask annotations | Asserts len(data) == 1
postprocess(data) | Mask annotations from inference | list[str] -- single base64-encoded pickled string | Uses pickle.dumps then base64.b64encode
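
The batch-size constraint on inference can be illustrated with a minimal sketch. StubMaskGenerator and run_inference are hypothetical stand-ins, not the handler's actual code; the stub returns placeholder annotations so the contract can be exercised without the SAM model or a GPU.

```python
class StubMaskGenerator:
    """Hypothetical stand-in for SamAutomaticMaskGenerator."""

    def generate(self, image):
        # A real generator returns one annotation dict per detected mask;
        # the values below are placeholders, not real SAM output.
        return [{"segmentation": {"size": [2, 2], "counts": "stub"}, "area": 4}]


def run_inference(mask_generator, data):
    """Mirror the documented contract: exactly one image per call."""
    assert len(data) == 1, "only a batch size of 1 is supported"
    return mask_generator.generate(data[0])


masks = run_inference(StubMaskGenerator(), ["image-placeholder"])
print(len(masks))
```

Because the guard is an assert statement, it is skipped when Python runs with the -O flag; a handler that must reject larger batches in every mode could raise an explicit exception instead.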

Usage Examples

Example 1: Model Initialization

# From custom_handler.py L25-52: initialize() loads SAM model
def initialize(self, ctx):
    properties = ctx.system_properties
    model_dir = properties.get("model_dir")
    # Default to CPU; switch to the assigned GPU when CUDA is available.
    self.device = "cpu"
    if torch.cuda.is_available() and properties.get("gpu_id") is not None:
        self.map_location = "cuda"
        self.device = torch.device(
            self.map_location + ":" + str(properties.get("gpu_id"))
        )
        torch.cuda.set_device(self.device)

    # Handler settings come from the "handler" section of the model YAML config.
    model_type = ctx.model_yaml_config["handler"]["model_type"]
    # Resolve the checkpoint relative to the model directory (requires `import os`).
    sam_checkpoint = os.path.join(
        model_dir, ctx.model_yaml_config["handler"]["sam_checkpoint"]
    )
    process_batch_size = ctx.model_yaml_config["handler"]["process_batch_size"]

    # Build the model from the registry and move it to the selected device.
    self.model = sam_model_fast_registry[model_type](checkpoint=sam_checkpoint)
    self.model.to(self.device)

    # Configure the mask generator to emit masks in COCO RLE format.
    self.mask_generator = SamAutomaticMaskGenerator(
        self.model, process_batch_size=process_batch_size, output_mode="coco_rle"
    )
    self.initialized = True
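
The device-selection branch at the top of initialize() can be isolated as a small pure function. This is a sketch under stated assumptions: select_device is a hypothetical helper that returns a device string rather than a torch.device, so it runs without PyTorch installed.

```python
def select_device(cuda_available, gpu_id):
    """Return the device string the handler would target."""
    # `is not None` matters here: gpu_id 0 is falsy but still a valid GPU.
    if cuda_available and gpu_id is not None:
        return "cuda:" + str(gpu_id)
    return "cpu"


print(select_device(True, 0))
print(select_device(True, None))
print(select_device(False, 0))
```

The handler falls back to CPU both when CUDA is unavailable and when TorchServe assigns no gpu_id to the worker.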

Example 2: Image Preprocessing Pipeline

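# From custom_handler.py L54-73: preprocess() handles multiple input formats
@timed
def preprocess(self, data):
    images = []
    for row in data:
        # The request payload arrives under either "data" or "body".
        image = row.get("data") or row.get("body")
        if isinstance(image, str):
            # String payloads are treated as base64-encoded image bytes.
            image = base64.b64decode(image)
        if isinstance(image, (bytearray, bytes)):
            # Decode the encoded image file with PIL.
            image = Image.open(io.BytesIO(image))
        else:
            # Any other payload is treated as list-style pixel data.
            image = torch.FloatTensor(image)
        # Swap the channel order from RGB to BGR.
        image = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)
        images.append(image)
    return images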
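
The payload-normalization branches above can be tried in isolation with the standard library alone. The helper extract_image_bytes is hypothetical and covers only the string and bytes paths; it returns None for list-style input, which the handler itself converts to a tensor instead.

```python
import base64


def extract_image_bytes(row):
    """Return raw image bytes from one request row, or None if not bytes-like."""
    payload = row.get("data") or row.get("body")
    if isinstance(payload, str):
        # String payloads are assumed to be base64-encoded image bytes.
        payload = base64.b64decode(payload)
    if isinstance(payload, (bytearray, bytes)):
        return bytes(payload)
    return None


raw = b"\x89PNG-placeholder"  # placeholder bytes, not a valid image
as_text = base64.b64encode(raw).decode("utf-8")

print(extract_image_bytes({"data": as_text}) == raw)
print(extract_image_bytes({"body": raw}) == raw)
print(extract_image_bytes({"data": [[0, 1], [2, 3]]}))
```

Note that the `or` fallback also skips an empty "data" value, so an empty string or empty bytes object under "data" causes "body" to be used.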

Example 3: Postprocess Serialization

# From custom_handler.py L82-90: postprocess() serializes mask data
@timed
def postprocess(self, data):
    serialized_data = pickle.dumps(data)
    base64_encoded_data = base64.b64encode(serialized_data).decode("utf-8")
    return [base64_encoded_data]
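
On the client side, the response has to be decoded in the reverse order: base64-decode first, then unpickle. The sketch below shows that round trip with placeholder annotations; decode_masks is a hypothetical helper, and the dictionary contents are not real SAM output.

```python
import base64
import pickle


def decode_masks(response_text):
    """Reverse the handler's postprocess step: base64-decode, then unpickle."""
    return pickle.loads(base64.b64decode(response_text))


# Simulate what postprocess() would return for placeholder annotations.
fake_masks = [{"area": 4, "bbox": [0, 0, 2, 2]}]
encoded = base64.b64encode(pickle.dumps(fake_masks)).decode("utf-8")

decoded = decode_masks(encoded)
print(decoded == fake_masks)
```

pickle.loads can execute arbitrary code while deserializing, so this decoding step is only appropriate for responses from a server the client trusts.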
