Implementation:Tencent Ncnn YOLO11 OBB Example
| Knowledge Sources | |
|---|---|
| Domains | Vision, Oriented_Object_Detection |
| Last Updated | 2026-02-09 19:00 GMT |
Overview
A C++ example that runs YOLO11 oriented bounding box (OBB) detection with the ncnn inference framework.
Description
This example implements YOLO11 oriented bounding box (OBB) detection using the ncnn inference framework. Unlike standard axis-aligned detection, OBB detection predicts rotated rectangles that fit objects tightly at arbitrary angles, which suits aerial imagery and document analysis. Input images are letterbox padded to 1024x1024. At that resolution the model produces two output blobs: a detection blob (w=79, h=21504) holding, for each candidate box, 64 DFL bbox regression coefficients (4 box sides x 16 bins) followed by 15 per-class scores, and a rotation blob (w=1, h=21504) holding the predicted angle for each candidate box. Each detection is stored as a cv::RotatedRect, and overlapping oriented detections are filtered with rotated NMS.
Usage
Use this example when you need to detect objects that appear at various orientations, such as vehicles in satellite imagery, text in rotated documents, or ships in aerial photographs. The oriented bounding boxes provide a much tighter fit than axis-aligned boxes for rotated objects.
Code Reference
Source Location
- Repository: Tencent_Ncnn
- File: examples/yolo11_obb.cpp
- Lines: 1-542
Signature
struct Object
{
cv::RotatedRect rrect;
int label;
float prob;
};
static int detect_yolo11_obb(const cv::Mat& bgr, std::vector<Object>& objects);
static void generate_proposals(int stride, const ncnn::Mat& pred,
const ncnn::Mat& pred_angle,
float prob_threshold, std::vector<Object>& objects);
static void qsort_descent_inplace(std::vector<Object>& objects);
static void nms_sorted_bboxes(const std::vector<Object>& objects,
std::vector<int>& picked, float nms_threshold,
bool agnostic = false);
Import
#include "layer.h"
#include "net.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| image_path | const char* | Yes | Path to input image file |
Outputs
| Name | Type | Description |
|---|---|---|
| objects | std::vector<Object> | Detected objects with rotated bounding boxes (cv::RotatedRect), class labels, and confidence scores |
Model Files
| File | Description |
|---|---|
| yolo11n_obb.ncnn.param | YOLO11-OBB nano model parameter file |
| yolo11n_obb.ncnn.bin | YOLO11-OBB nano model weight file |
Usage Examples
Running the Example
./yolo11_obb image.jpg
Key Code Pattern
ncnn::Net yolo11;
yolo11.opt.use_vulkan_compute = true;
yolo11.load_param("yolo11n_obb.ncnn.param");
yolo11.load_model("yolo11n_obb.ncnn.bin");
const int target_size = 1024;
const float prob_threshold = 0.25f;
const float nms_threshold = 0.45f;
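// Resize to (w, h): aspect ratio preserved, longer side equal to target_size
ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data,
ncnn::Mat::PIXEL_BGR2RGB, img_w, img_h, w, h);
// Letterbox pad to 1024x1024 with the constant value 114
int wpad = target_size - w;
int hpad = target_size - h;
ncnn::Mat in_pad;
ncnn::copy_make_border(in, in_pad, hpad / 2, hpad - hpad / 2,
wpad / 2, wpad - wpad / 2, ncnn::BORDER_CONSTANT, 114.f);
// Scale pixel values to [0, 1]
const float norm_vals[3] = {1 / 255.f, 1 / 255.f, 1 / 255.f};
in_pad.substract_mean_normalize(0, norm_vals);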
ncnn::Extractor ex = yolo11.create_extractor();
ex.input("in0", in_pad);
ncnn::Mat out0; // bbox regression + class scores (w=79, h=21504)
ncnn::Mat out1; // rotation angle (w=1, h=21504)
ex.extract("out0", out0);
ex.extract("out1", out1);
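Once out0 is extracted, each of its rows has to be screened against prob_threshold before any box is decoded. The sketch below shows one way to do that for a single row. It is not taken from the example: the helper name best_class is hypothetical, and it assumes that the 15 class scores follow the 64 box values in each row and that they are raw logits needing a sigmoid.

```cpp
#include <cassert>
#include <cmath>

constexpr int kBoxValues = 64;   // 4 box sides x 16 DFL bins, per the layout on this page
constexpr int kNumClasses = 15;

inline float sigmoid(float x)
{
    return 1.f / (1.f + std::exp(-x));
}

// Hypothetical helper: returns the index of the highest-scoring class for one
// row of out0, or -1 when its probability falls below prob_threshold.
// Assumption: class scores are stored after the box values and are raw logits.
inline int best_class(const float* row, float prob_threshold, float& prob)
{
    assert(row != nullptr);
    const float* scores = row + kBoxValues;

    int label = 0;
    for (int k = 1; k < kNumClasses; k++)
    {
        if (scores[k] > scores[label])
            label = k;
    }

    // Sigmoid is monotonic, so the argmax can be taken on logits first
    // and the sigmoid applied only once.
    prob = sigmoid(scores[label]);
    return prob >= prob_threshold ? label : -1;
}
```

Rows that return -1 can be skipped without running the more expensive box decoding.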
Implementation Details
Preprocessing
Input images are resized with the aspect ratio preserved, then letterbox padded with the constant value 114 to 1024x1024 (a multiple of max_stride=32). After padding, pixel values are scaled to [0, 1] by multiplying by 1/255.
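The letterbox arithmetic can be sketched without any ncnn or OpenCV dependency. The struct and function names below (LetterboxGeometry, compute_letterbox, unletterbox_point) are hypothetical, and splitting the padding evenly between the two sides is an assumption rather than something this page states.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper type, not part of the ncnn example.
struct LetterboxGeometry
{
    int resized_w;   // width after the aspect-preserving resize
    int resized_h;   // height after the aspect-preserving resize
    float scale;     // resized size / original size
    int pad_left, pad_right, pad_top, pad_bottom;
};

// Scale the longer side to target_size, then pad the shorter side so the
// result is target_size x target_size.
inline LetterboxGeometry compute_letterbox(int img_w, int img_h, int target_size)
{
    assert(img_w > 0 && img_h > 0 && target_size > 0);

    LetterboxGeometry g{};
    if (img_w > img_h)
    {
        g.scale = static_cast<float>(target_size) / img_w;
        g.resized_w = target_size;
        g.resized_h = std::max(1, static_cast<int>(img_h * g.scale));
    }
    else
    {
        g.scale = static_cast<float>(target_size) / img_h;
        g.resized_h = target_size;
        g.resized_w = std::max(1, static_cast<int>(img_w * g.scale));
    }

    // Assumption: padding is split evenly, with any odd pixel going right/bottom.
    const int wpad = target_size - g.resized_w;
    const int hpad = target_size - g.resized_h;
    g.pad_left = wpad / 2;
    g.pad_right = wpad - wpad / 2;
    g.pad_top = hpad / 2;
    g.pad_bottom = hpad - hpad / 2;
    return g;
}

// Map a point from padded-input coordinates back to the original image.
inline void unletterbox_point(const LetterboxGeometry& g, float x, float y,
                              float& out_x, float& out_y)
{
    out_x = (x - g.pad_left) / g.scale;
    out_y = (y - g.pad_top) / g.scale;
}
```

For a 2048x1024 image and target_size 1024, this gives a 1024x512 resize with 256 rows of padding above and below. The same geometry is needed after inference to move box centers and sizes back into original image coordinates.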
Output Tensor Layout
The model produces two output tensors:
- out0 (w=79, h=21504): Each row holds 64 DFL bbox regression values (4 box sides x 16 bins) and 15 per-class scores. The 21504 rows are the candidate boxes from the three stride levels (8, 16, 32) at 1024x1024 input: 128x128 + 64x64 + 32x32 = 21504.
- out1 (w=1, h=21504): Contains one rotation angle per candidate box
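The numbers in this layout can be checked, and one candidate decoded, with a short sketch. It is illustrative only: RotatedBox, count_candidates, dfl_expectation and decode_rbox are hypothetical names, and the decoding formula assumes the usual Ultralytics OBB convention (softmax expectation per side, distances in grid units, offset rotated by the predicted angle). Any angle offset or unit conversion the example applies is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

constexpr int kRegMax = 16;     // DFL bins per box side
constexpr int kNumClasses = 15;
constexpr int kRowWidth = 4 * kRegMax + kNumClasses;   // 64 + 15 = 79

// Hypothetical plain struct standing in for cv::RotatedRect (angle in radians here).
struct RotatedBox { float cx, cy, w, h, angle; };

// Number of candidate boxes for a square input across strides 8, 16 and 32.
inline int count_candidates(int input_size)
{
    assert(input_size > 0 && input_size % 32 == 0);
    const int strides[3] = {8, 16, 32};
    int total = 0;
    for (int stride : strides)
    {
        const int cells = input_size / stride;
        total += cells * cells;
    }
    return total;
}

// Decode one box side: softmax over the 16 bins, then the expected bin index.
// The result is a distance in grid units (multiply by the stride for pixels).
inline float dfl_expectation(const float* logits)
{
    const float max_logit = *std::max_element(logits, logits + kRegMax);
    float sum = 0.f;
    float weighted = 0.f;
    for (int i = 0; i < kRegMax; i++)
    {
        const float e = std::exp(logits[i] - max_logit);   // shifted for stability
        sum += e;
        weighted += e * i;
    }
    return weighted / sum;
}

// Assumption: dist = {left, top, right, bottom} in grid units, (ax, ay) is the
// anchor center in pixels, and the center offset is rotated by `angle`.
inline RotatedBox decode_rbox(float ax, float ay, int stride,
                              const float dist[4], float angle)
{
    const float l = dist[0] * stride, t = dist[1] * stride;
    const float r = dist[2] * stride, b = dist[3] * stride;
    const float xf = (r - l) * 0.5f;
    const float yf = (b - t) * 0.5f;
    const float c = std::cos(angle), s = std::sin(angle);
    return {ax + xf * c - yf * s, ay + xf * s + yf * c, l + r, t + b, angle};
}
```

Note that cv::RotatedRect stores its angle in degrees, so a radian angle like the one kept here would need converting before it is stored in the Object struct.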
Rotated NMS
Standard NMS measures overlap with axis-aligned IoU. This implementation instead finds the intersection polygon of two rotated rectangles with cv::rotatedRectangleIntersection and measures its area with cv::contourArea, so the overlap reflects the actual orientation of each box.
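To make the idea concrete without an OpenCV dependency, the sketch below swaps the two OpenCV calls for a hand-written Sutherland-Hodgman clip and a shoelace area, then runs a greedy NMS over score-sorted detections. All type and function names here (Pt, RBox, Det, rotated_iou, nms_sorted) are hypothetical, and the loop is a common greedy formulation rather than a copy of the example's nms_sorted_bboxes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { float x, y; };
// Hypothetical stand-ins for cv::RotatedRect and the example's Object struct.
struct RBox { float cx, cy, w, h, angle; };   // angle in radians
struct Det { RBox box; int label; float prob; };

// Four corners of a rotated box, in a consistent winding order.
inline std::vector<Pt> corners(const RBox& b)
{
    const float c = std::cos(b.angle), s = std::sin(b.angle);
    const float hw = b.w * 0.5f, hh = b.h * 0.5f;
    const float dx[4] = {-hw, hw, hw, -hw};
    const float dy[4] = {-hh, -hh, hh, hh};
    std::vector<Pt> out(4);
    for (int i = 0; i < 4; i++)
        out[i] = {b.cx + dx[i] * c - dy[i] * s, b.cy + dx[i] * s + dy[i] * c};
    return out;
}

// >= 0 when p lies on or to the left of the directed edge a->b.
inline float side(const Pt& a, const Pt& b, const Pt& p)
{
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Sutherland-Hodgman: clip a convex polygon against each edge of another.
inline std::vector<Pt> clip_convex(std::vector<Pt> subject, const std::vector<Pt>& clip)
{
    for (std::size_t i = 0; i < clip.size() && !subject.empty(); i++)
    {
        const Pt a = clip[i], b = clip[(i + 1) % clip.size()];
        std::vector<Pt> out;
        for (std::size_t j = 0; j < subject.size(); j++)
        {
            const Pt p = subject[j], q = subject[(j + 1) % subject.size()];
            const float dp = side(a, b, p), dq = side(a, b, q);
            if (dp >= 0.f) out.push_back(p);
            if ((dp >= 0.f) != (dq >= 0.f))   // edge p->q crosses the clip line
            {
                const float t = dp / (dp - dq);
                out.push_back({p.x + t * (q.x - p.x), p.y + t * (q.y - p.y)});
            }
        }
        subject = out;
    }
    return subject;
}

// Shoelace formula; returns 0 for an empty polygon.
inline float polygon_area(const std::vector<Pt>& poly)
{
    float twice = 0.f;
    for (std::size_t i = 0; i < poly.size(); i++)
    {
        const Pt& p = poly[i];
        const Pt& q = poly[(i + 1) % poly.size()];
        twice += p.x * q.y - q.x * p.y;
    }
    return std::fabs(twice) * 0.5f;
}

inline float rotated_iou(const RBox& a, const RBox& b)
{
    const float inter = polygon_area(clip_convex(corners(a), corners(b)));
    const float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.f ? inter / uni : 0.f;
}

// Greedy NMS. `dets` must already be sorted by descending prob.
inline std::vector<int> nms_sorted(const std::vector<Det>& dets, float iou_threshold,
                                   bool agnostic = false)
{
    std::vector<int> picked;
    for (int i = 0; i < static_cast<int>(dets.size()); i++)
    {
        bool keep = true;
        for (int j : picked)
        {
            if (!agnostic && dets[i].label != dets[j].label) continue;
            if (rotated_iou(dets[i].box, dets[j].box) > iou_threshold) { keep = false; break; }
        }
        if (keep) picked.push_back(i);
    }
    return picked;
}
```

Two 4x2 boxes that share a center but differ by 90 degrees have a rotated IoU of about 1/3, while their axis-aligned extents alone would not reveal how little they overlap. That difference is why rotated NMS avoids suppressing crossing objects that axis-aligned NMS might merge.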
Model Conversion
Models are converted from the Ultralytics format through a multi-step PNNX pipeline. The Python export script must be modified to support dynamic shape inference, and the model is then re-exported with two input shapes (1024x1024 and 512x512).