Implementation: Tencent Ncnn RFCN Example
| Knowledge Sources | |
|---|---|
| Domains | Vision, Object_Detection |
| Last Updated | 2026-02-09 19:00 GMT |
Overview
Concrete tool for object detection inference using R-FCN (Region-based Fully Convolutional Networks) with ncnn.
Description
This example demonstrates R-FCN object detection on the 20-class PASCAL VOC dataset with a ResNet-50 backbone. R-FCN differs from Faster R-CNN by replacing the per-region fully connected layers with position-sensitive score maps and position-sensitive ROI pooling, which makes multi-class detection more efficient. Preprocessing resizes the image so the shorter side is 224 pixels while preserving aspect ratio, subtracts the ImageNet BGR means (102.9801, 115.9465, 122.7717), and feeds the result together with im_info metadata. Inference runs in two steps:
- Step 1 extracts the position-sensitive class and bbox feature maps (rfcn_cls, rfcn_bbox) along with the region proposals (rois).
- Step 2 iterates over each ROI, applying position-sensitive pooling to obtain per-class probabilities (cls_prob) and bounding-box regression (bbox_pred).

Post-processing applies class-agnostic bbox regression (offset index 4), per-class NMS with an OMP-parallelized quicksort, a confidence threshold of 0.6, an NMS threshold of 0.3, and a cap of 100 detections per image.
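The shorter-side resize described above can be sketched as follows. `compute_resize` is a hypothetical helper, not a function from rfcn.cpp; the resulting `scale` is what also feeds the im_info blob.

```cpp
#include <cassert>

// Hypothetical helper mirroring the resize rule described above:
// scale the shorter image side to target_size, preserving aspect ratio.
static void compute_resize(int cols, int rows, int target_size,
                           int& w, int& h, float& scale)
{
    w = cols;
    h = rows;
    if (w < h)
    {
        scale = (float)target_size / w; // width is the shorter side
        w = target_size;
        h = (int)(h * scale);
    }
    else
    {
        scale = (float)target_size / h; // height is the shorter side
        h = target_size;
        w = (int)(w * scale);
    }
}
```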
Usage
Use this example to perform multi-class object detection with R-FCN architecture. Detects 20 VOC classes: aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor.
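The class table can be written as a plain array. As in other ncnn detection examples, index 0 is assumed to be a background entry, so labels 1-20 map to the VOC classes listed above (a sketch, not copied verbatim from rfcn.cpp):

```cpp
#include <cassert>
#include <cstring>

// VOC class-name table in the order listed above, with an assumed
// background entry at index 0 (21 entries total).
static const char* class_names[] = {
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow", "diningtable", "dog",
    "horse", "motorbike", "person", "pottedplant", "sheep", "sofa",
    "train", "tvmonitor"
};
```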
Code Reference
Source Location
- Repository: Tencent_Ncnn
- File: examples/rfcn.cpp
- Lines: 1-351
Signature
static int detect_rfcn(const cv::Mat& bgr, std::vector<Object>& objects);
static void draw_objects(const cv::Mat& bgr, const std::vector<Object>& objects);
int main(int argc, char** argv);
Import
#include "net.h"
#include <math.h>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| imagepath | const char* (argv[1]) | Yes | Path to input image |
Outputs
| Name | Type | Description |
|---|---|---|
| objects | std::vector<Object> | Detected objects with bounding box (Rect_<float>), class label (int), and probability (float) |
| visualization | cv::Mat | Image with drawn bounding boxes and class labels displayed via cv::imshow |
Usage Examples
Running the Example
./rfcn image.jpg
The model files rfcn_end2end.param and rfcn_end2end.bin must be present in the working directory (they are loaded by name in the code pattern below).
Key Code Pattern
ncnn::Net rfcn;
rfcn.opt.use_vulkan_compute = true;
rfcn.load_param("rfcn_end2end.param");
rfcn.load_model("rfcn_end2end.bin");
// w, h: target dims with the shorter side scaled to 224; scale is the resize factor
ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data, ncnn::Mat::PIXEL_BGR, bgr.cols, bgr.rows, w, h);
const float mean_vals[3] = {102.9801f, 115.9465f, 122.7717f};
in.substract_mean_normalize(mean_vals, 0);
ncnn::Mat im_info(3);
im_info[0] = h; im_info[1] = w; im_info[2] = scale;
// Step 1: extract position-sensitive maps and proposals
ncnn::Extractor ex1 = rfcn.create_extractor();
ex1.input("data", in);
ex1.input("im_info", im_info);
ncnn::Mat rfcn_cls, rfcn_bbox, rois;
ex1.extract("rfcn_cls", rfcn_cls);
ex1.extract("rfcn_bbox", rfcn_bbox);
ex1.extract("rois", rois);
// Step 2: per-ROI position-sensitive pooling
for (int i = 0; i < rois.c; i++)
{
    ncnn::Extractor ex2 = rfcn.create_extractor();
    ncnn::Mat roi = rois.channel(i);
    ex2.input("rfcn_cls", rfcn_cls);
    ex2.input("rfcn_bbox", rfcn_bbox);
    ex2.input("rois", roi);
    ncnn::Mat bbox_pred, cls_prob;
    ex2.extract("bbox_pred", bbox_pred);
    ex2.extract("cls_prob", cls_prob);
    // ... decode bbox regression and classification
}
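The elided decode step can be sketched as a standard class-agnostic bbox-delta decode (the deltas are read at offset 4 of bbox_pred, per the description). This is an illustrative sketch; exact pixel conventions may differ slightly from rfcn.cpp.

```cpp
#include <cassert>
#include <cmath>

// Sketch: apply (dx, dy, dw, dh) deltas to an ROI in (x1, y1, x2, y2)
// form. dx/dy shift the box center; dw/dh scale its size via exp().
static void decode_bbox(const float roi[4], const float d[4], float out[4])
{
    float pb_w = roi[2] - roi[0] + 1;   // proposal width
    float pb_h = roi[3] - roi[1] + 1;   // proposal height
    float cx = roi[0] + pb_w * 0.5f;    // proposal center
    float cy = roi[1] + pb_h * 0.5f;

    float gx = cx + pb_w * d[0];        // shifted center
    float gy = cy + pb_h * d[1];
    float gw = pb_w * std::exp(d[2]);   // scaled size
    float gh = pb_h * std::exp(d[3]);

    out[0] = gx - gw * 0.5f;            // back to corner form
    out[1] = gy - gh * 0.5f;
    out[2] = gx + gw * 0.5f;
    out[3] = gy + gh * 0.5f;
}
```

With zero deltas the decode reproduces the proposal (up to the +1 width convention), which is a useful sanity check when porting the post-processing.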