Principle:Obss Sahi Per Slice Detection
| Knowledge Sources | |
|---|---|
| Domains | Object_Detection, Computer_Vision, Inference |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
The process of running object detection inference on individual image slices and remapping the resulting prediction coordinates back to the original full-image coordinate space.
Description
Per-slice detection is the core inference step in sliced prediction. Each image slice produced by the slicing step is fed to the detection model independently. The model produces bounding boxes, confidence scores, and optionally masks in the coordinate frame of the slice.
Since each slice is a crop from a larger image, the raw prediction coordinates are relative to the slice origin, not the full image. A critical coordinate remapping step shifts all predictions by the slice's starting pixel offset (shift_amount) and records the full image dimensions (full_shape) so that downstream merging operates in a consistent coordinate space.
This per-slice approach works because a small object occupies a larger fraction of a slice than of the full image, so it spans more pixels once the input is resized to the model's input resolution. This can substantially improve detection recall for small objects that would otherwise be missed at full-image scale.
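To make the scale argument concrete, the following sketch compares how many pixels an object spans after resizing. It assumes a detector that resizes its input so the longer side is 640 pixels; that value, the image size, and the helper name `apparent_size` are illustrative assumptions, not taken from this page.

```python
def apparent_size(object_px: float, input_long_side: float,
                  model_long_side: float = 640.0) -> float:
    """Pixels an object spans after the input is resized to the model resolution.

    Assumes a plain proportional resize (hypothetical model setting of 640 px).
    """
    return object_px * model_long_side / input_long_side


# A 20 px wide object in a 4000 px wide image (hypothetical numbers).
full_image_size = apparent_size(20, input_long_side=4000)  # fed as one full image
slice_size = apparent_size(20, input_long_side=512)        # fed as a 512 px slice

print(f"full image: {full_image_size:.1f} px, 512 px slice: {slice_size:.1f} px")
```

Under these assumptions the object shrinks to roughly 3 pixels when the full image is fed to the model, but spans 25 pixels when it is seen inside a 512-pixel slice.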
Usage
Use per-slice detection as the third step in the SAHI sliced inference pipeline, after image slicing. This step is applied iteratively to each slice. The key parameters are:
- shift_amount: The [x, y] pixel offset of the slice within the original image
- full_shape: The [height, width] of the original image for mask remapping
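The sketch below shows how these two parameters might be derived and used in a per-slice loop. It is a minimal illustration, not SAHI's implementation: `detect_fn` is a hypothetical stand-in for a real detector, the slices do not overlap, and only bounding boxes are handled.

```python
import numpy as np


def detect_fn(slice_img: np.ndarray) -> list:
    """Hypothetical detector stub: one [x1, y1, x2, y2] box in slice coordinates."""
    h, w = slice_img.shape[:2]
    return [[0.25 * w, 0.25 * h, 0.5 * w, 0.5 * h]]


image = np.zeros((600, 800, 3), dtype=np.uint8)   # height 600, width 800
full_shape = [image.shape[0], image.shape[1]]     # [height, width]
slice_size = 400

# Slice origins as [x, y] offsets. Real slicing usually adds overlap between slices.
shift_amounts = [[x, y]
                 for y in range(0, full_shape[0], slice_size)
                 for x in range(0, full_shape[1], slice_size)]

remapped = []
for shift_x, shift_y in shift_amounts:
    slice_img = image[shift_y:shift_y + slice_size, shift_x:shift_x + slice_size]
    for x1, y1, x2, y2 in detect_fn(slice_img):
        # Translate from slice coordinates to full-image coordinates.
        remapped.append([x1 + shift_x, y1 + shift_y, x2 + shift_x, y2 + shift_y])

# Sanity check: every remapped box lies inside the full image.
full_h, full_w = full_shape
inside = all(0 <= b[0] <= b[2] <= full_w and 0 <= b[1] <= b[3] <= full_h
             for b in remapped)
print(len(remapped), inside)
```

Note the ordering conventions: shift_amount is [x, y], while full_shape is [height, width], so the two must not be swapped when indexing arrays.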
Theoretical Basis
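The coordinate remapping is a pure translation: a box (x_min, y_min, x_max, y_max) predicted in slice coordinates maps to (x_min + shift_x, y_min + shift_y, x_max + shift_x, y_max + shift_y) in full-image coordinates, where [shift_x, shift_y] is the slice's shift_amount.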
For masks, the slice-level boolean mask is placed at the correct offset within a full-image-sized canvas:
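```python
import numpy as np

# Sketch of coordinate remapping. `prediction` is assumed to expose `bbox`
# as [x_min, y_min, x_max, y_max] and an optional boolean `mask` array.
def remap_prediction(prediction, shift_x, shift_y, full_h, full_w):
    # Translate the box from slice coordinates to full-image coordinates.
    prediction.bbox = np.asarray(prediction.bbox) + [shift_x, shift_y, shift_x, shift_y]
    if prediction.mask is not None:
        # The slice size is taken from the mask itself.
        slice_h, slice_w = prediction.mask.shape
        full_mask = np.zeros((full_h, full_w), dtype=bool)
        full_mask[shift_y:shift_y + slice_h, shift_x:shift_x + slice_w] = prediction.mask
        prediction.mask = full_mask
    return prediction
```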
The pipeline processes each slice independently (embarrassingly parallel in principle), then collects all remapped predictions for the merging step.
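Because no slice depends on another, the per-slice work can in principle be distributed across workers. The sketch below illustrates this with a thread pool and checks that the result matches a serial loop. `detect_fn` and `detect_and_remap` are hypothetical names introduced for illustration; in practice, batching slices on a GPU is often preferable to threading.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def detect_fn(slice_img):
    """Hypothetical detector stub: one [x1, y1, x2, y2] box in slice coordinates."""
    h, w = slice_img.shape[:2]
    return [[0.0, 0.0, w / 2, h / 2]]


def detect_and_remap(image, shift_x, shift_y, size):
    """Run the stub on one slice and shift its boxes into full-image coordinates."""
    slice_img = image[shift_y:shift_y + size, shift_x:shift_x + size]
    return [[x1 + shift_x, y1 + shift_y, x2 + shift_x, y2 + shift_y]
            for x1, y1, x2, y2 in detect_fn(slice_img)]


image = np.zeros((512, 512, 3), dtype=np.uint8)
size = 256
offsets = [(x, y) for y in (0, 256) for x in (0, 256)]  # (shift_x, shift_y) per slice

# Serial reference.
serial = [detect_and_remap(image, x, y, size) for x, y in offsets]

# Parallel version: pool.map returns results in the same order as `offsets`.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(lambda o: detect_and_remap(image, o[0], o[1], size), offsets))

# Flatten the per-slice lists into one list for the merging step.
all_predictions = [box for per_slice in parallel for box in per_slice]
print(parallel == serial, len(all_predictions))
```

The flattened list is what the merging step receives; since every box is already in full-image coordinates, the order in which slices were processed does not matter.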