Workflow:Roboflow Rf detr ONNX Export

From Leeroopedia


Knowledge Sources
Domains: Computer_Vision, Model_Export, Deployment
Last Updated: 2026-02-08 15:00 GMT

Overview

End-to-end process for exporting a trained RF-DETR model to ONNX format for deployment with inference frameworks such as ONNX Runtime, TensorRT, or OpenVINO.

Description

This workflow covers the model export pipeline from a PyTorch RF-DETR model to ONNX format. It handles model preparation (switching to export mode), dummy input generation, torch.onnx.export with configurable opset version, and optional ONNX graph simplification using onnxsim and a custom graph optimizer. The exported model accepts normalized image tensors and produces bounding box coordinates and class logits (plus optional segmentation masks). Both full model export and backbone-only export are supported.

Usage

Execute this workflow after training a custom RF-DETR model or when deploying a pretrained model to production environments. Use this when you need framework-independent inference (ONNX Runtime), GPU-optimized inference (TensorRT), or edge deployment (OpenVINO). The ONNX format enables integration with serving platforms, mobile runtimes, and custom inference pipelines outside the PyTorch ecosystem.

Execution Steps

Step 1: Install Export Dependencies

Install the ONNX export extra, which pulls in onnx, onnxsim, and onnxruntime in addition to the base rfdetr package.

Key considerations:

  • Run pip install "rfdetr[onnxexport]" to get all required dependencies
  • The export functionality imports from rfdetr.deploy.export, which requires onnx and onnxsim
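
Before running an export, it can help to confirm that the extra dependencies are importable. The following is a minimal sketch using only the standard library; the helper name missing_modules is hypothetical and not part of rfdetr.

```python
from importlib.util import find_spec

# Top-level modules named in this step as export dependencies.
EXPORT_DEPS = ("onnx", "onnxsim", "onnxruntime")


def missing_modules(names=EXPORT_DEPS):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
        print('Try: pip install "rfdetr[onnxexport]"')
    else:
        print("All export dependencies are importable.")
```

An empty list means all three packages were found; it does not verify that their versions are compatible with each other.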

Step 2: Load Trained Model

Instantiate the appropriate RF-DETR model class with the trained checkpoint weights. This can be a COCO-pretrained model or a custom fine-tuned checkpoint. The model architecture and resolution are determined by the model class.

Key considerations:

  • Pass the checkpoint path via pretrain_weights parameter
  • The model resolution determines the default input shape for export
  • Both detection and segmentation models can be exported
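
As an illustration, loading a fine-tuned checkpoint might look like the sketch below. The wrapper load_rfdetr and its model_class argument are hypothetical conveniences; RFDETRBase is used as an example class name and should be replaced with the class that matches the trained architecture. The rfdetr import is deferred so the path check runs even where the package is not installed.

```python
from pathlib import Path


def load_rfdetr(checkpoint_path, model_class="RFDETRBase"):
    """Instantiate an RF-DETR model class with weights from `checkpoint_path`.

    `model_class` is the name of a class exposed by the rfdetr package
    (an assumption here; pick the one matching the trained architecture).
    """
    checkpoint = Path(checkpoint_path)
    if not checkpoint.is_file():
        raise FileNotFoundError(f"Checkpoint not found: {checkpoint}")

    import rfdetr  # deferred: only needed once the path has been validated

    model_cls = getattr(rfdetr, model_class)
    # pretrain_weights is the parameter named in this step.
    return model_cls(pretrain_weights=str(checkpoint))
```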

Step 3: Export to ONNX

Call the export() method. Internally it prepares the model for export (switching to eval mode and enabling export-specific code paths in the architecture), creates a dummy input tensor of the correct shape, and runs torch.onnx.export with constant folding enabled. The resulting ONNX file has a named input (input) and named outputs: dets and labels for detection models, plus masks for segmentation models.

Key considerations:

  • The default opset version is 17; higher opset versions support more operators but may not be accepted by older runtimes
  • Custom input shapes can be specified but must be divisible by 14 (the patch size constraint)
  • The backbone_only option exports just the feature extractor for use in custom pipelines
  • Export is performed on CPU for maximum compatibility
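
The options above can be sketched as a thin wrapper around export(). The keyword names passed to export() (output_dir, shape, opset_version, backbone_only, simplify) are assumptions inferred from the options described in this step and should be checked against the installed rfdetr version; validate_export_shape and export_to_onnx are hypothetical helpers.

```python
PATCH_SIZE = 14  # divisibility constraint stated in this step


def validate_export_shape(shape, patch_size=PATCH_SIZE):
    """Check that (height, width) is positive and divisible by the patch size."""
    height, width = shape
    if height <= 0 or width <= 0 or height % patch_size or width % patch_size:
        raise ValueError(
            f"Export shape {shape} must be positive and divisible by {patch_size}"
        )
    return (height, width)


def export_to_onnx(model, output_dir="output", shape=None, opset_version=17,
                   backbone_only=False, simplify=False):
    """Call model.export() with validated options.

    `model` is a loaded RF-DETR model object. The keyword names below are
    assumptions, not a confirmed signature.
    """
    kwargs = {
        "output_dir": output_dir,
        "opset_version": opset_version,
        "backbone_only": backbone_only,
        "simplify": simplify,
    }
    if shape is not None:
        # Omit `shape` to let the model resolution set the default input size.
        kwargs["shape"] = validate_export_shape(shape)
    model.export(**kwargs)
```

Validating the shape before calling export() surfaces a bad resolution immediately rather than partway through graph tracing.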

Step 4: Simplify ONNX Model (Optional)

If simplification is enabled, the exported ONNX graph is processed through a custom graph optimizer (OnnxOptimizer) that performs common optimization passes, followed by onnxsim.simplify() which folds constants, removes redundant nodes, and validates output consistency. The simplified model is saved alongside the original.

Key considerations:

  • Simplification can reduce graph size, which may improve inference performance and compatibility with some runtimes
  • The output file is saved as inference_model.sim.onnx (or backbone_model.sim.onnx)
  • The simplifier validates outputs against the original model to ensure correctness
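
For an ONNX file that was exported without simplification, the onnxsim part of this step can be approximated by hand. This sketch covers only onnxsim.simplify() and does not reproduce the custom OnnxOptimizer pass; the helper names are hypothetical, and the output naming mirrors the .sim.onnx convention described above.

```python
from pathlib import Path


def simplified_path(onnx_path):
    """Map e.g. inference_model.onnx to inference_model.sim.onnx."""
    path = Path(onnx_path)
    return path.with_name(f"{path.stem}.sim{path.suffix}")


def simplify_onnx(onnx_path):
    """Run onnxsim on an exported model and save the result next to it."""
    import onnx  # deferred so the path helper works without the export extra
    from onnxsim import simplify

    model = onnx.load(str(onnx_path))
    # simplify() returns the simplified model and a flag indicating whether
    # its outputs matched the original model on test inputs.
    simplified, outputs_match = simplify(model)
    if not outputs_match:
        raise RuntimeError("Simplified model failed the output consistency check")

    out_path = simplified_path(onnx_path)
    onnx.save(simplified, str(out_path))
    return out_path
```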

Step 5: Validate and Deploy

Verify the exported ONNX model by running inference with ONNX Runtime. The input must be preprocessed identically to the PyTorch pipeline: resize to the model resolution, normalize with ImageNet mean [0.485, 0.456, 0.406] and standard deviation [0.229, 0.224, 0.225], and convert to an NCHW float32 tensor. The model outputs raw boxes and logits, which require post-processing (confidence thresholding and scaling coordinates to the original image dimensions).

Key considerations:

  • With ONNX Runtime, create an InferenceSession for the exported file and call session.run() with the preprocessed tensor
  • For optimized GPU inference, the ONNX model can be converted to a TensorRT engine
  • The model does not include NMS; post-processing is handled externally
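
The validation step can be sketched with NumPy and ONNX Runtime as follows. Several points are assumptions rather than facts from this page: the image is already resized to the model resolution and scaled from uint8 to [0, 1] before normalization, dets holds normalized (cx, cy, w, h) boxes, and per-query scores are the sigmoid of the logits. The function names are hypothetical.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(image_rgb):
    """HxWx3 uint8 RGB array (already resized) -> 1x3xHxW float32 tensor."""
    x = image_rgb.astype(np.float32) / 255.0
    x = (x - IMAGENET_MEAN) / IMAGENET_STD
    return np.ascontiguousarray(x.transpose(2, 0, 1)[None])


def postprocess(dets, labels, orig_w, orig_h, threshold=0.5):
    """Threshold scores and scale boxes to original-image pixel coordinates.

    Assumes dets is (1, Q, 4) normalized cxcywh and labels is (1, Q, C) logits.
    """
    scores_all = 1.0 / (1.0 + np.exp(-labels[0]))  # sigmoid over class logits
    class_ids = scores_all.argmax(axis=1)
    scores = scores_all.max(axis=1)
    keep = scores > threshold

    cx, cy, w, h = dets[0][keep].T
    boxes = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
    boxes = boxes * np.array([orig_w, orig_h, orig_w, orig_h], dtype=np.float32)
    return boxes, scores[keep], class_ids[keep]


def run_onnx(model_path, image_rgb, orig_w, orig_h, threshold=0.5):
    """Run the exported model on CPU and return filtered detections."""
    import onnxruntime as ort  # deferred so the helpers above work without it

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    # Input and output names as listed in Step 3.
    dets, labels = session.run(["dets", "labels"], {"input": preprocess(image_rgb)})
    return postprocess(dets, labels, orig_w, orig_h, threshold)
```

Comparing these outputs with the PyTorch model's predictions on the same image is a reasonable check that the preprocessing assumptions hold for a given checkpoint.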
