Workflow:PeterL1n BackgroundMattingV2 Model export

From Leeroopedia


Knowledge Sources
Domains: Computer_Vision, Image_Matting, Model_Deployment
Last Updated: 2026-02-09 02:30 GMT

Overview

Export trained matting models to TorchScript or ONNX format for production deployment, enabling inference without the original Python source code.

Description

This workflow converts trained PyTorch matting model checkpoints into deployment-optimized formats. Two export paths are supported:

TorchScript: Produces a self-contained .pth file containing both architecture and weights via torch.jit.script. The exported model can be loaded in Python or C++ environments without needing the model source code. A wrapper class hoists configurable attributes (backbone_scale, refine_mode, refine_sample_pixels, refine_threshold) to the top level for easy runtime adjustment.

ONNX: Produces a cross-platform .onnx file using torch.onnx.export with configurable opset version and constant folding. Dynamic axes enable variable batch size and resolution at inference time. Due to the novel patch-based refinement architecture, multiple compatibility options are provided for the crop and replace operations (roi_align/gather for cropping, scatter_nd/scatter_element for replacing).

Both formats support float32 and float16 precision. The ONNX export includes an optional validation step that compares PyTorch and ONNX outputs to verify numerical accuracy.

Usage

Run this workflow once training is complete and the matting model needs to be deployed to a production environment. Use TorchScript for PyTorch/LibTorch deployment (Python or C++). Use ONNX for cross-framework deployment (TensorRT, ONNX Runtime, or other ONNX-compatible backends). Choose based on the target inference platform and its performance requirements.

Execution Steps

Step 1: Checkpoint selection

Identify the best trained checkpoint from the training pipeline output. The checkpoint must be a PyTorch state_dict .pth file produced by the training scripts. Verify the checkpoint matches the intended backbone architecture (ResNet50, ResNet101, or MobileNetV2).

Key considerations:

  • Checkpoints contain only weights (state_dict), not architecture definition
  • The backbone architecture must be specified explicitly during export
  • Select based on validation loss from TensorBoard training logs
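The selection and architecture check described above can be sketched in plain Python. This is an illustration only: the checkpoint names, loss values, and parameter key names are hypothetical. In practice the losses would be read from the TensorBoard logs, and the keys would come from the loaded state_dict and a freshly constructed model.

```python
def pick_best_checkpoint(val_losses: dict[str, float]) -> str:
    """Return the checkpoint path with the lowest validation loss."""
    if not val_losses:
        raise ValueError("no checkpoints to choose from")
    return min(val_losses, key=val_losses.get)


def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Report the key mismatches that would make a strict weight load fail."""
    checkpoint_keys, model_keys = set(checkpoint_keys), set(model_keys)
    missing = sorted(model_keys - checkpoint_keys)     # model expects, checkpoint lacks
    unexpected = sorted(checkpoint_keys - model_keys)  # checkpoint has, model does not
    return missing, unexpected


# Hypothetical validation losses, one per saved checkpoint.
losses = {"epoch-0.pth": 0.0412, "epoch-1.pth": 0.0298, "epoch-2.pth": 0.0331}
best = pick_best_checkpoint(losses)
print("best checkpoint:", best)

# Hypothetical key names: a checkpoint from one backbone, a model built with another.
ckpt_keys = ["backbone.layer4.0.conv1.weight", "refiner.conv1.weight"]
model_keys = ["backbone.features.17.conv.0.weight", "refiner.conv1.weight"]
missing, unexpected = diff_state_dict_keys(ckpt_keys, model_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

Any non-empty `missing` or `unexpected` list suggests the checkpoint was trained with a different backbone than the one being instantiated for export.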

Step 2: Model instantiation and weight loading

Create the model architecture with the target configuration. For TorchScript export, the model is wrapped in a class that hoists the configurable inference parameters to the top level so they can be adjusted after loading. For ONNX export, the model is instantiated with the exact inference configuration (refinement mode, sample pixels, threshold) baked in. Load the checkpoint weights and set the model to evaluation mode.

Key considerations:

  • TorchScript wrapper enables runtime parameter changes after loading
  • ONNX models have fixed refinement configuration at export time
  • For ONNX, choose compatibility options for patch crop/replace methods based on target backend
  • All parameters are frozen (requires_grad=False) before export
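A minimal sketch of the load, eval, and freeze sequence is shown here. It assumes a tiny stand-in network (`TinyMatting`, a hypothetical class, not the real matting model) and an in-memory checkpoint instead of a .pth file on disk, so that it runs without the repository's source code.

```python
import io

import torch
from torch import nn


class TinyMatting(nn.Module):
    """Hypothetical stand-in for the real matting network."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(6, 1, kernel_size=3, padding=1)

    def forward(self, src: torch.Tensor, bgr: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.conv(torch.cat([src, bgr], dim=1)))


# Simulate a training checkpoint: a state_dict written with torch.save.
checkpoint_file = io.BytesIO()
torch.save(TinyMatting().state_dict(), checkpoint_file)
checkpoint_file.seek(0)

# Export side: rebuild the architecture first, then load the weights into it.
model = TinyMatting()
state_dict = torch.load(checkpoint_file, map_location="cpu")
model.load_state_dict(state_dict)  # strict by default, so key mismatches raise

model.eval()  # switch layers such as dropout and batch norm to inference behavior
for param in model.parameters():
    param.requires_grad = False  # freeze every parameter before export
```

Because the checkpoint holds only weights, the class constructed here has to match the architecture that produced it; the strict load is what surfaces a wrong backbone choice.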

Step 3: Precision configuration

Optionally convert the model to half-precision (float16) for reduced model size and faster inference on supported hardware. This is applied after weight loading but before the export step.

Key considerations:

  • float16 reduces model size by approximately 50%
  • float16 inference requires GPU hardware with half-precision support
  • Some operations may have reduced numerical accuracy in float16
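The size reduction can be checked directly by summing parameter storage before and after conversion. The sketch below uses a small hypothetical stand-in network rather than the matting model; `parameter_bytes` is an illustrative helper, not part of the export scripts.

```python
import torch
from torch import nn


def parameter_bytes(module: nn.Module) -> int:
    """Total number of bytes occupied by the module's parameters."""
    return sum(p.numel() * p.element_size() for p in module.parameters())


# Hypothetical stand-in network with float32 weights.
model = nn.Sequential(
    nn.Conv2d(6, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 4, kernel_size=3, padding=1),
)
model.eval()

size_fp32 = parameter_bytes(model)
model = model.half()  # cast floating-point parameters and buffers to float16
size_fp16 = parameter_bytes(model)

print(f"float32: {size_fp32} bytes, float16: {size_fp16} bytes")
```

Inputs passed to a half-precision model generally need to be cast to float16 as well, otherwise the dtype mismatch raises an error at inference time.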

Step 4: Model export

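For TorchScript: Apply torch.jit.script to compile the model into TorchScript IR, then save it to a .pth file. Unlike torch.jit.trace, scripting compiles the model's Python source directly rather than recording a single run on example inputs, so data-dependent control flow is preserved. The resulting file is self-contained, holding both architecture and weights.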
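The wrapper idea can be sketched with a toy model. Everything below is an assumption made for illustration: `TinyRefiner` and `ExportWrapper` are hypothetical classes, and only one attribute (`refine_threshold`) is hoisted. The sketch shows the pattern of copying a top-level attribute down into the inner model on each call, which is what makes the value adjustable after the scripted file is loaded.

```python
import io

import torch
from torch import nn


class TinyRefiner(nn.Module):
    """Hypothetical inner model with one configurable attribute."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(6, 1, kernel_size=3, padding=1)
        self.threshold = 0.5

    def forward(self, src: torch.Tensor, bgr: torch.Tensor) -> torch.Tensor:
        err = torch.sigmoid(self.conv(torch.cat([src, bgr], dim=1)))
        return (err > self.threshold).float()  # 1.0 where a pixel is selected


class ExportWrapper(nn.Module):
    """Hoists the inner attribute to the top level of the exported module."""

    def __init__(self):
        super().__init__()
        self.model = TinyRefiner()
        self.refine_threshold = self.model.threshold  # top-level copy

    def forward(self, src: torch.Tensor, bgr: torch.Tensor) -> torch.Tensor:
        # Push the top-level value down into the inner model before each call.
        self.model.threshold = self.refine_threshold
        return self.model(src, bgr)


scripted = torch.jit.script(ExportWrapper().eval())

# Save and reload through an in-memory buffer instead of a .pth file.
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
loaded = torch.jit.load(buffer)  # no Python class definitions needed here

src = torch.rand(1, 3, 16, 16)
bgr = torch.rand(1, 3, 16, 16)

loaded.refine_threshold = 2.0   # sigmoid never exceeds 1, so nothing is selected
none_selected = loaded(src, bgr)
loaded.refine_threshold = -1.0  # sigmoid is never below 0, so everything is selected
all_selected = loaded(src, bgr)
```

Hoisting keeps the adjustable settings one attribute access away on the loaded module, so callers do not need to know the internal module hierarchy.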

For ONNX: Generate dummy input tensors and call torch.onnx.export with the configured opset version, constant folding option, named inputs/outputs, and dynamic axes for batch, height, and width dimensions. The export produces a .onnx file.

ONNX compatibility options:

  • Patch crop methods: unfold (unlikely to work), roi_align (recommended), gather
  • Patch replace methods: scatter_nd (faster when supported), scatter_element (wider compatibility)
  • Opset versions 11-12 are recommended
  • Thresholding mode may have better backend compatibility than sampling mode
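The export arguments described above can be assembled as a plain dictionary. This is a sketch under stated assumptions: the helper `build_onnx_export_kwargs` is hypothetical, the input names `src` and `bgr` and the NCHW tensor layout are assumed, and the output names are taken from the validation step's output list. The keys are keyword arguments accepted by torch.onnx.export.

```python
def build_onnx_export_kwargs(opset_version: int = 12, constant_folding: bool = True) -> dict:
    """Collect keyword arguments for a torch.onnx.export call."""
    input_names = ["src", "bgr"]  # assumed input names
    output_names = ["pha", "fgr", "pha_sm", "fgr_sm", "err_sm", "ref_sm"]

    # Assuming NCHW tensors: axis 0 is batch, axis 2 is height, axis 3 is width.
    # The channel axis (1) stays fixed.
    dynamic = {0: "batch", 2: "height", 3: "width"}

    return {
        "opset_version": opset_version,
        "do_constant_folding": constant_folding,
        "input_names": input_names,
        "output_names": output_names,
        "dynamic_axes": {name: dict(dynamic) for name in input_names + output_names},
    }


kwargs = build_onnx_export_kwargs(opset_version=11)
for key, value in kwargs.items():
    print(key, "=", value)

# With a model and dummy inputs in hand, the call would look like:
#   torch.onnx.export(model, (src, bgr), "model.onnx", **kwargs)
```

Naming the dynamic axes is optional but makes the exported graph's input and output signatures easier to read in tools that inspect ONNX files.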

Step 5: Export validation

For ONNX exports, optionally run a validation step that uses ONNX Runtime to compare the outputs of the exported ONNX model against those of the original PyTorch model. Test with input dimensions different from those used during export to verify that the dynamic axes work correctly. Validation passes if the maximum absolute error across all outputs is below 0.005.

What happens:

  • Create test inputs at a different resolution (720x1280) than export resolution (1080x1920)
  • Run inference through both PyTorch and ONNX Runtime
  • Compare all output tensors element-wise
  • Report maximum absolute difference per output (pha, fgr, pha_sm, fgr_sm, err_sm, ref_sm)
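The comparison itself reduces to a per-output maximum absolute difference checked against the 0.005 tolerance. The sketch below uses synthetic NumPy arrays in place of real model outputs; `compare_outputs` is a hypothetical helper. In an actual run, the reference arrays would come from the PyTorch model (converted to NumPy) and the candidate arrays from an ONNX Runtime inference session.

```python
import numpy as np

OUTPUT_NAMES = ["pha", "fgr", "pha_sm", "fgr_sm", "err_sm", "ref_sm"]
TOLERANCE = 0.005


def compare_outputs(reference, candidate, names=OUTPUT_NAMES, tolerance=TOLERANCE):
    """Return (passed, report), where report maps output name to max abs error."""
    report = {}
    for name, ref, out in zip(names, reference, candidate):
        ref = np.asarray(ref, dtype=np.float32)
        out = np.asarray(out, dtype=np.float32)
        if ref.shape != out.shape:
            raise ValueError(f"{name}: shape {out.shape} does not match {ref.shape}")
        report[name] = float(np.abs(ref - out).max())
    passed = all(err < tolerance for err in report.values())
    return passed, report


# Synthetic stand-ins for the outputs of the two runtimes.
rng = np.random.default_rng(0)
reference = [rng.random((1, 1, 8, 8), dtype=np.float32) for _ in OUTPUT_NAMES]
close = [r + 1e-4 for r in reference]    # small numerical drift
drifted = [r + 1e-2 for r in reference]  # drift above the tolerance

ok, report = compare_outputs(reference, close)
bad, _ = compare_outputs(reference, drifted)
for name, err in report.items():
    print(f"{name}: max abs error {err:.6f}")
print("close passes:", ok, "| drifted passes:", bad)
```

Reporting the error per output, rather than a single aggregate, makes it easier to tell whether a mismatch originates in the coarse outputs or in the refinement stage.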
