Principle:Tencent Ncnn Bounding Box Decoding

From Leeroopedia


Knowledge Sources
Domains Computer_Vision, Object_Detection
Last Updated 2026-02-09 00:00 GMT

Overview

Algorithm for converting raw neural network detection output tensors into spatial bounding box coordinates with class predictions, using Distribution Focal Loss (DFL) decoding for anchor-free detectors.

Description

Modern anchor-free object detectors (YOLOv8, YOLO11, NanoDet-Plus) encode each bounding box prediction as a set of distributions over discrete offset bins rather than regressing coordinates directly. This representation, trained with Distribution Focal Loss (DFL), models each box edge distance as a probability distribution over 16 bins (reg_max=16); the decoded distance is the expected value of that distribution.
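To make the expected-value step concrete, here is a minimal sketch in plain Python that decodes a single box edge. The function name `dfl_expected_distance` and the sample logits are illustrative assumptions, not part of ncnn or any detector's API.

```python
import math

def dfl_expected_distance(bin_logits):
    """Decode one box edge: softmax over the bins, then the expected bin index."""
    m = max(bin_logits)  # subtract the max before exponentiating for numerical stability
    exps = [math.exp(v - m) for v in bin_logits]
    total = sum(exps)
    # Expected value: sum over bins of (bin index * probability of that bin)
    return sum(j * e / total for j, e in enumerate(exps))

# Hypothetical inputs: a flat distribution and one sharply peaked at bin 3
uniform = dfl_expected_distance([0.0] * 16)
peaked = dfl_expected_distance([0.0] * 3 + [20.0] + [0.0] * 12)
print(uniform, peaked)
```

A flat distribution decodes to the midpoint of the bin range, while a sharply peaked one decodes to roughly the index of its peak. Because the result is an expectation, decoded distances can fall between integer bin indices.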

The decoding process involves: (1) applying sigmoid activation to class scores, (2) filtering by confidence threshold, (3) applying softmax over the 16 bins of each box edge, (4) taking the expected bin index as the edge distance, in grid-cell units, and (5) converting these distances to absolute pixel coordinates using the grid position and stride of each feature-map level.

Older anchor-based detectors (YOLOv5) use a simpler decoding scheme built on predefined anchor boxes: the network predicts a center offset relative to the grid cell and scale factors that are applied to the anchor's width and height.
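For contrast, the sketch below shows one common formulation of anchor-based decoding, the one used by YOLOv5-style heads as far as we are aware. The function name, the argument layout, and the 10x13 anchor size are assumptions chosen for illustration. Check the exported model's head before relying on these exact formulas.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def decode_anchor_box(raw, gx, gy, stride, anchor_w, anchor_h):
    """Decode one anchor prediction; raw = (tx, ty, tw, th) logits at grid cell (gx, gy)."""
    tx, ty, tw, th = raw
    # Center: offset around the grid cell, scaled to pixels by the stride
    cx = (sigmoid(tx) * 2.0 - 0.5 + gx) * stride
    cy = (sigmoid(ty) * 2.0 - 0.5 + gy) * stride
    # Size: a bounded multiplier (0 to 4) applied to the anchor dimensions
    w = (sigmoid(tw) * 2.0) ** 2 * anchor_w
    h = (sigmoid(th) * 2.0) ** 2 * anchor_h
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)

# Zero logits place the box at the cell center with exactly the anchor's size
anchor_box = decode_anchor_box((0.0, 0.0, 0.0, 0.0), gx=10, gy=5, stride=8,
                               anchor_w=10.0, anchor_h=13.0)
print(anchor_box)  # (79.0, 37.5, 89.0, 50.5)
```

Unlike DFL decoding, each edge here comes from a single regressed value per coordinate rather than from a distribution over bins.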

Usage

Use DFL decoding for anchor-free detectors (YOLOv8, YOLO11, NanoDet-Plus). Use anchor-based decoding for older models (YOLOv5, YOLOv3). The output tensor format determines which decoding method to apply.

Theoretical Basis

DFL (Distribution Focal Loss) decoding:

For each detection at grid position (x, y) with stride s:

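$$d_i = \sum_{j=0}^{15} j \cdot \operatorname{softmax}\left(\text{pred}_{i,\,0:16}\right)[j], \qquad i \in \{\text{left}, \text{top}, \text{right}, \text{bottom}\}$$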

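Then convert to box coordinates:

$$x_0 = (x + 0.5 - d_{\text{left}}) \times s, \qquad y_0 = (y + 0.5 - d_{\text{top}}) \times s$$

$$x_1 = (x + 0.5 + d_{\text{right}}) \times s, \qquad y_1 = (y + 0.5 + d_{\text{bottom}}) \times s$$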
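As a worked example of the coordinate conversion, the following sketch plugs hypothetical numbers into the formulas above. The function name `distances_to_box` and the sample values are assumptions for illustration only.

```python
def distances_to_box(gx, gy, stride, d_left, d_top, d_right, d_bottom):
    """Convert decoded edge distances (in grid units) to pixel coordinates."""
    cx = gx + 0.5  # center of the grid cell, in grid units
    cy = gy + 0.5
    return ((cx - d_left) * stride,
            (cy - d_top) * stride,
            (cx + d_right) * stride,
            (cy + d_bottom) * stride)

# Grid cell (10, 5) on the stride-8 level with distances 2, 1, 3, 4
box = distances_to_box(10, 5, 8, 2.0, 1.0, 3.0, 4.0)
print(box)  # (68.0, 36.0, 108.0, 76.0)
```

The distances are measured from the cell center, so the left and top distances are subtracted and the right and bottom distances are added before scaling by the stride.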

Pseudo-code:

// Abstract DFL decode algorithm
for each grid_position (gx, gy) at each stride:
    class_scores = sigmoid(raw_classes)
    if max(class_scores) < threshold: continue

    // Decode one distance per edge (in grid units) and keep all four
    for each edge in {left, top, right, bottom}:
        bins = raw_pred[edge * 16 : (edge+1) * 16]
        d[edge] = sum(softmax(bins) * range(16))

    x0 = (gx + 0.5 - d[left]) * stride
    y0 = (gy + 0.5 - d[top]) * stride
    x1 = (gx + 0.5 + d[right]) * stride
    y1 = (gy + 0.5 + d[bottom]) * stride
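
The pseudo-code above can be turned into a small runnable sketch. The version below uses only the Python standard library and assumes a hypothetical input layout: each grid cell is a pair of (64 box logits, class logits), stored in row-major order. Real ncnn output tensors are laid out differently, and post-processing such as box clipping and non-maximum suppression is deliberately left out.

```python
import math

REG_MAX = 16  # bins per box edge, as described in the text

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def softmax(values):
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def decode_level(cells, grid_w, stride, threshold):
    """Decode one feature-map level.

    cells: row-major list of (box_logits, class_logits) per grid cell,
           with len(box_logits) == 4 * REG_MAX (assumed layout).
    Returns a list of ((x0, y0, x1, y1), class_id, score).
    """
    detections = []
    for idx, (box_logits, class_logits) in enumerate(cells):
        gy, gx = divmod(idx, grid_w)
        scores = [sigmoid(c) for c in class_logits]
        score = max(scores)
        if score < threshold:
            continue  # skip low-confidence cells before the costlier box decode
        d = []
        for edge in range(4):  # 0 = left, 1 = top, 2 = right, 3 = bottom
            probs = softmax(box_logits[edge * REG_MAX:(edge + 1) * REG_MAX])
            d.append(sum(j * p for j, p in enumerate(probs)))
        cx, cy = gx + 0.5, gy + 0.5
        box = ((cx - d[0]) * stride, (cy - d[1]) * stride,
               (cx + d[2]) * stride, (cy + d[3]) * stride)
        detections.append((box, scores.index(score), score))
    return detections

def peak(k):
    """Hypothetical helper: logits sharply peaked at bin k."""
    return [30.0 if j == k else 0.0 for j in range(REG_MAX)]

# 2x2 grid: three background cells and one confident cell at (gx=1, gy=1)
background = ([0.0] * (4 * REG_MAX), [-10.0, -10.0])
obj = (peak(1) + peak(1) + peak(2) + peak(2), [-10.0, 5.0])
dets = decode_level([background, background, background, obj],
                    grid_w=2, stride=8, threshold=0.25)
print(dets)
```

With these inputs only the last cell passes the threshold, and its box comes out close to (4, 4, 28, 28) for class 1. Filtering on the class score before the per-edge softmax mirrors the order of the pseudo-code and avoids decoding boxes that would be discarded anyway.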
