Heuristic:PeterL1n BackgroundMattingV2 Refine Mode Selection

Knowledge Sources	BackgroundMattingV2 Model Usage
Domains	Computer_Vision, Optimization
Last Updated	2026-02-09 02:00 GMT

Overview

Use `sampling` mode for real-time applications (fixed computation budget), `thresholding` for quality-critical image editing, and `full` only for debugging.

Description

The refinement network supports three modes that control which error-prone patches are selected for full-resolution refinement. The mode choice directly sets the computation/quality trade-off at inference time: `sampling` gives a fixed, predictable computation cost, `thresholding` adapts the amount of refinement to the image content, and `full` refines everything with no selection.

Usage

Use this heuristic when deploying MattingRefine for production or configuring inference scripts. The training script defaults to `sampling` mode. Switch modes based on whether your application prioritizes latency predictability or output quality.

The Insight (Rule of Thumb)

Action: Choose `refine_mode` based on application requirements.
Values:
- `sampling` (default): Refines a fixed budget of top-error pixels (default 80,000), selected as the `sample_pixels // 16` highest entries of the error map. Best for live/real-time applications where computation per frame must be bounded.
- `thresholding`: Refines all pixels above an error threshold (default 0.1 for export, 0.7 for inference scripts). Best for image editing where quality matters more than speed.
- `full`: Refines the entire image with regular Conv2d. Only for debugging; eliminates the efficiency benefit of selective refinement.
Training default: `sampling` mode is used during training (`train_refine.py:53`).
Trade-off: `sampling` has fixed computation but may miss some errors; `thresholding` adapts to content but has variable computation; `full` is slowest but highest quality.
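
The rule of thumb above can be written down as a small lookup. This is a minimal sketch, not code from the repository: `choose_refine_kwargs` and the use-case labels are hypothetical, and the keyword names `refine_sample_pixels` and `refine_threshold` are assumptions about the `MattingRefine` constructor (only `refine_mode` is named on this page). The numeric defaults are the ones quoted above.

```python
from typing import Any, Dict

# Defaults quoted on this page. Keyword names other than `refine_mode`
# are assumptions about the MattingRefine constructor.
SAMPLE_PIXELS_DEFAULT = 80_000
THRESHOLD_EXPORT_DEFAULT = 0.1


def choose_refine_kwargs(use_case: str) -> Dict[str, Any]:
    """Map an application type to refinement settings (hypothetical helper)."""
    if use_case == "realtime":
        # Bounded work per frame: refine a fixed pixel budget.
        return {"refine_mode": "sampling",
                "refine_sample_pixels": SAMPLE_PIXELS_DEFAULT}
    if use_case == "editing":
        # Content-adaptive work: refine everything above the threshold.
        return {"refine_mode": "thresholding",
                "refine_threshold": THRESHOLD_EXPORT_DEFAULT}
    if use_case == "debug":
        # No selection at all: the whole image is refined.
        return {"refine_mode": "full"}
    raise ValueError(f"unknown use case: {use_case!r}")


print(choose_refine_kwargs("realtime"))
```

If the assumed keyword names match your checkout, the returned dictionary could be unpacked into the model constructor; otherwise treat it as a checklist of the settings each mode depends on.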

Reasoning

The selective refinement architecture is the key innovation of BackgroundMattingV2. The coarse network predicts an error map, and only patches with high predicted error are refined at full resolution. `sampling` mode selects a fixed count of top-error patches, which bounds VRAM and compute regardless of scene complexity. `thresholding` mode refines every patch whose predicted error exceeds a threshold, so the work is content-adaptive but can spike on complex scenes. The threshold defaults differ between contexts: training/export uses 0.1, a low cutoff that marks more patches for refinement and favors quality, while the inference scripts use 0.7, a high cutoff that refines fewer patches and favors speed.

Code evidence from `model/refiner.py:171-184`:

if self.mode == 'sampling':
    # Sampling mode.
    b, _, h, w = err.shape
    err = err.view(b, -1)
    idx = err.topk(self.sample_pixels // 16, dim=1, sorted=False).indices
    ref = torch.zeros_like(err)
    ref.scatter_(1, idx, 1.)
    if self.prevent_oversampling:
        ref.mul_(err.gt(0).float())
    ref = ref.view(b, 1, h, w)
else:
    # Thresholding mode.
    ref = err.gt(self.threshold).float()
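
To see the budget difference in isolation, the selection logic in the excerpt can be mimicked without PyTorch. The sketch below is an illustration under stated assumptions, not the repository's code: it works on a flat Python list instead of a batched tensor, the function names are hypothetical, and the `min(...)` clamp on `k` is an addition so the toy inputs can be smaller than the budget.

```python
import heapq
from typing import List


def select_sampling(err: List[float], sample_pixels: int,
                    prevent_oversampling: bool = True) -> List[int]:
    """Return a 0/1 mask marking the top `sample_pixels // 16` error entries."""
    # Clamp k to the map size (an addition for small toy inputs).
    k = min(sample_pixels // 16, len(err))
    top = heapq.nlargest(k, range(len(err)), key=err.__getitem__)
    mask = [0] * len(err)
    for i in top:
        mask[i] = 1
    if prevent_oversampling:
        # Drop selected entries whose predicted error is not positive.
        mask = [m if e > 0 else 0 for m, e in zip(mask, err)]
    return mask


def select_thresholding(err: List[float], threshold: float) -> List[int]:
    """Return a 0/1 mask marking every entry strictly above `threshold`."""
    return [1 if e > threshold else 0 for e in err]


# Two toy error maps: one mostly clean, one with errors almost everywhere.
simple = [0.0, 0.05, 0.2, 0.0, 0.8, 0.0, 0.0, 0.0]
busy = [0.3, 0.9, 0.2, 0.75, 0.8, 0.15, 0.6, 0.4]

budget = 3 * 16  # toy budget: at most 3 entries selected
for name, err in (("simple", simple), ("busy", busy)):
    print(name,
          sum(select_sampling(err, budget)),
          sum(select_thresholding(err, 0.1)),
          sum(select_thresholding(err, 0.7)))
```

On these toy maps the sampling count stays at the budget of 3 for both inputs, while thresholding at 0.1 grows from 2 selected entries on the simple map to 8 on the busy one. Raising the threshold to 0.7 cuts the selection to 1 and 3 entries respectively.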

Documentation evidence from `doc/model_usage.md:138-140`:

sampling: suitable for live applications where computation per frame has a fixed upperbound.
thresholding: suitable for image editing where quality outweighs speed.
full: only used for debugging.
