Implementation: LaurentMazare tch-rs YOLO Detection
| Knowledge Sources | |
|---|---|
| Domains | Object Detection, Computer Vision, Image Processing |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Performs YOLO v3 object detection on input images, including model loading, confidence thresholding, non-maximum suppression, and bounding box annotation.
Description
This module implements the full YOLO v3 inference pipeline as a command-line application. It loads a Darknet configuration and pre-trained weights, processes one or more input images, and produces annotated output images with detected object bounding boxes.
The pipeline consists of several stages:
Image preprocessing: Input images are loaded and resized to the network's expected dimensions (specified in the .cfg file). The resized image is normalized to [0, 1] range and batched.
Detection: The model produces raw predictions which are processed by the report function. Predictions are filtered by a confidence threshold of 0.5, and each detection is assigned to the class with the highest score. Bounding boxes are extracted in center-format (x, y, w, h) and converted to corner-format (xmin, ymin, xmax, ymax).
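The filtering and conversion step can be sketched in plain Rust, independent of tch. The `to_bbox` helper and the exact prediction-row layout (`[cx, cy, w, h, objectness, class scores...]`, the standard YOLO v3 output format) are assumptions for illustration, not names from the example itself:

```rust
// Hypothetical post-processing sketch: a prediction row is assumed to be
// [cx, cy, w, h, objectness, class scores...]. Rows below the confidence
// threshold are dropped; survivors become corner-format boxes.
const CONFIDENCE_THRESHOLD: f64 = 0.5;

#[derive(Debug, Clone, Copy)]
struct Bbox { xmin: f64, ymin: f64, xmax: f64, ymax: f64, confidence: f64 }

fn to_bbox(pred: &[f64]) -> Option<(usize, Bbox)> {
    let (cx, cy, w, h, conf) = (pred[0], pred[1], pred[2], pred[3], pred[4]);
    if conf < CONFIDENCE_THRESHOLD { return None; }
    // Assign the detection to the class with the highest score.
    let class = pred[5..]
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .map(|(i, _)| i)?;
    // Convert center format (cx, cy, w, h) to corner format.
    Some((class, Bbox {
        xmin: cx - w / 2.0,
        ymin: cy - h / 2.0,
        xmax: cx + w / 2.0,
        ymax: cy + h / 2.0,
        confidence: conf,
    }))
}

fn main() {
    // One row: center (100, 50), size 40x20, objectness 0.9, two class scores.
    let row = [100.0, 50.0, 40.0, 20.0, 0.9, 0.1, 0.8];
    let (class, b) = to_bbox(&row).unwrap();
    // class=1, box=(80, 40, 120, 60)
    println!("class={} box=({}, {}, {}, {})", class, b.xmin, b.ymin, b.xmax, b.ymax);
}
```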
Non-maximum suppression (NMS): For each class, detections are sorted by confidence in descending order. Each detection is compared against all previously accepted detections using Intersection over Union (IoU). Detections with IoU exceeding the NMS threshold of 0.4 against any accepted detection are suppressed.
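The IoU computation and the greedy per-class suppression loop amount to the following standalone sketch (no tch dependency; the `nms` helper name is an assumption for illustration):

```rust
#[derive(Debug, Clone, Copy)]
struct Bbox { xmin: f64, ymin: f64, xmax: f64, ymax: f64, confidence: f64 }

const NMS_THRESHOLD: f64 = 0.4;

// Intersection over Union of two corner-format boxes.
fn iou(b1: &Bbox, b2: &Bbox) -> f64 {
    let a1 = (b1.xmax - b1.xmin) * (b1.ymax - b1.ymin);
    let a2 = (b2.xmax - b2.xmin) * (b2.ymax - b2.ymin);
    let ix = (b1.xmax.min(b2.xmax) - b1.xmin.max(b2.xmin)).max(0.0);
    let iy = (b1.ymax.min(b2.ymax) - b1.ymin.max(b2.ymin)).max(0.0);
    let inter = ix * iy;
    inter / (a1 + a2 - inter)
}

// Greedy NMS for one class: sort by confidence (descending), keep a box
// only if its IoU with every previously kept box stays at or below the
// threshold.
fn nms(mut boxes: Vec<Bbox>) -> Vec<Bbox> {
    boxes.sort_by(|a, b| b.confidence.partial_cmp(&a.confidence).unwrap());
    let mut kept: Vec<Bbox> = Vec::new();
    for b in boxes {
        if kept.iter().all(|k| iou(k, &b) <= NMS_THRESHOLD) {
            kept.push(b);
        }
    }
    kept
}

fn main() {
    let boxes = vec![
        Bbox { xmin: 0.0, ymin: 0.0, xmax: 10.0, ymax: 10.0, confidence: 0.9 },
        // Heavily overlaps the first box (IoU ~ 0.68 > 0.4) -> suppressed.
        Bbox { xmin: 1.0, ymin: 1.0, xmax: 11.0, ymax: 11.0, confidence: 0.8 },
        // Disjoint from both -> kept.
        Bbox { xmin: 20.0, ymin: 20.0, xmax: 30.0, ymax: 30.0, confidence: 0.7 },
    ];
    let kept = nms(boxes);
    println!("{} boxes kept", kept.len()); // prints "2 boxes kept"
}
```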
Annotation: Surviving bounding boxes are drawn on the original (non-resized) image by scaling coordinates back to original dimensions. The draw_rect function paints blue (RGB 0,0,1) rectangles using 2-pixel wide borders. Class names are printed to stdout using COCO class labels.
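The coordinate rescaling in the annotation stage is a simple ratio mapping from network input space back to the original image size. The sketch below uses a hypothetical `scale_to_original` helper and assumes the common 416x416 YOLO v3 input size for the example values:

```rust
// Hypothetical helper: map a corner-format box from network input space
// (net_width x net_height) back to the original image dimensions.
fn scale_to_original(
    bbox: (f64, f64, f64, f64), // (xmin, ymin, xmax, ymax) in network space
    net: (f64, f64),            // (net_width, net_height)
    orig: (f64, f64),           // (orig_width, orig_height)
) -> (f64, f64, f64, f64) {
    let sx = orig.0 / net.0;
    let sy = orig.1 / net.1;
    (bbox.0 * sx, bbox.1 * sy, bbox.2 * sx, bbox.3 * sy)
}

fn main() {
    // A box detected in 416x416 network space, original image 1280x720.
    let scaled = scale_to_original(
        (104.0, 208.0, 312.0, 416.0),
        (416.0, 416.0),
        (1280.0, 720.0),
    );
    // Approximately (320, 360, 960, 720) in original image coordinates.
    println!("{:?}", scaled);
}
```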
Annotated images are saved as output-{index:05}.jpg files.
Usage
Use this module for running YOLO v3 object detection on images from the command line. It requires the Darknet configuration file (yolo-v3.cfg), pre-trained weights in .ot format, and one or more input images as arguments.
Code Reference
Source Location
- Repository: LaurentMazare_Tch_rs
- File: examples/yolo/main.rs
- Lines: 1-153
Signature
```rust
#[derive(Debug, Clone, Copy)]
struct Bbox {
    xmin: f64,
    ymin: f64,
    xmax: f64,
    ymax: f64,
    confidence: f64,
}

fn iou(b1: &Bbox, b2: &Bbox) -> f64
pub fn draw_rect(t: &mut Tensor, x1: i64, x2: i64, y1: i64, y2: i64)
pub fn report(pred: &Tensor, img: &Tensor, w: i64, h: i64) -> Result<Tensor>
pub fn main() -> Result<()>
```
Import
```rust
use anyhow::{ensure, Result};
use tch::nn::ModuleT;
use tch::vision::image;
use tch::Tensor;
```
I/O Contract
| Input | Type | Description |
|---|---|---|
| args[1] | Path | Path to pre-trained weights file (.ot format) |
| args[2..] | Paths | One or more input image file paths |
| CONFIG_NAME | &str | Path to Darknet config ("examples/yolo/yolo-v3.cfg") |
| CONFIDENCE_THRESHOLD | f64 | Minimum objectness confidence (0.5) |
| NMS_THRESHOLD | f64 | IoU threshold for non-maximum suppression (0.4) |
| Output | Type | Description |
|---|---|---|
| report return | Tensor | Annotated image tensor with bounding boxes drawn, shape [3, H, W] |
| iou return | f64 | Intersection over Union between two bounding boxes |
| Saved images | JPEG files | output-00000.jpg, output-00001.jpg, etc. |
| Stdout | Text | Class name and bounding box details for each detection |
| Bbox Field | Type | Description |
|---|---|---|
| xmin, ymin | f64 | Top-left corner coordinates (in network input space) |
| xmax, ymax | f64 | Bottom-right corner coordinates (in network input space) |
| confidence | f64 | Objectness confidence score |
Usage Examples
```rust
// Command-line usage:
// cargo run --example yolo -- yolo-v3.ot photo1.jpg photo2.jpg

// Programmatic usage (relies on the example's local `darknet` module
// and `report` function):
use tch::nn::ModuleT;
use tch::vision::image;

// Load model
let mut vs = tch::nn::VarStore::new(tch::Device::Cpu);
let darknet = darknet::parse_config("examples/yolo/yolo-v3.cfg")?;
let model = darknet.build_model(&vs.root())?;
vs.load("yolo-v3.ot")?;

// Process an image
let original_image = image::load("photo.jpg")?;
let net_width = darknet.width()?;
let net_height = darknet.height()?;
let resized = image::resize(&original_image, net_width, net_height)?;
let input = resized.unsqueeze(0).to_kind(tch::Kind::Float) / 255.;

// Run detection
let predictions = model.forward_t(&input, false).squeeze();

// Apply confidence filtering, NMS, and draw bounding boxes
let annotated = report(&predictions, &original_image, net_width, net_height)?;
image::save(&annotated, "output-00000.jpg")?;
```