Implementation:Datajuicer Data juicer DWposeDetector

Knowledge Sources	Datajuicer_Data_juicer
Domains	Computer Vision, Pose Estimation, ONNX Inference
Last Updated	2026-02-14 16:00 GMT

Overview

Implements the DWPose whole-body pose detection pipeline, providing functions to detect and visualize body, hand, and face keypoints from images using ONNX-based detection and pose estimation models.

Description

This module is adapted from the IDEA-Research DWPose project and provides a specialized computer vision utility for human pose estimation. It enables video/image mapper operators to extract human pose information from visual data.

Key Classes:

Wholebody -- Loads ONNX detection (YOLOX) and pose estimation models via ONNX Runtime. Performs person detection followed by keypoint estimation with MMPose-to-OpenPose index remapping. Computes a synthetic neck joint from shoulder keypoints.
DWposeDetector -- High-level detector wrapping Wholebody. Normalizes coordinates to [0,1] range, separates body/foot/face/hand keypoints, and renders pose visualization.
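
The synthetic neck joint described for Wholebody can be sketched as below. This is a minimal illustration rather than the module's own code: it assumes COCO keypoint ordering (index 5 is the left shoulder, index 6 the right shoulder), and the function name add_neck, the 0.3 confidence threshold, and the insertion index 17 are assumptions made for the example.

```python
import numpy as np

def add_neck(keypoints, scores, thr=0.3, insert_at=17):
    """Insert a neck joint computed as the midpoint of the two shoulders.

    keypoints: (N, K, 2) pixel coordinates, scores: (N, K) confidences.
    Hypothetical helper; thr and insert_at are illustrative assumptions.
    """
    # Midpoint of left (5) and right (6) shoulder, shape (N, 2)
    neck_xy = keypoints[:, [5, 6]].mean(axis=1)
    # The neck counts as visible only if both shoulders pass the threshold
    both_visible = (scores[:, [5, 6]] > thr).all(axis=1)
    neck_score = both_visible.astype(scores.dtype)
    # Insert the new joint along the keypoint axis
    keypoints = np.insert(keypoints, insert_at, neck_xy, axis=1)
    scores = np.insert(scores, insert_at, neck_score, axis=1)
    return keypoints, scores
```

The neck is marked visible only when both shoulders are, so a half-occluded torso does not produce a misleading joint.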

Detection Pipeline Functions:

nms() / multiclass_nms() -- Non-maximum suppression for bounding box filtering
demo_postprocess() -- Decodes YOLOX grid-based predictions
preprocess_det() / inference_detector() -- Image preprocessing and person detection
preprocess_pose() / inference() / postprocess() -- Pose estimation with affine transforms and SimCC decoding
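
For readers unfamiliar with the bounding box filtering step, greedy non-maximum suppression can be sketched as follows. This is a generic NumPy version under stated assumptions (boxes in x1, y1, x2, y2 format; the name nms_sketch is hypothetical) and is not copied from the module's nms().

```python
import numpy as np

def nms_sketch(boxes, scores, iou_thr):
    """Greedy NMS. boxes: (M, 4) as x1, y1, x2, y2; scores: (M,).

    Returns the indices of the boxes that are kept, best score first.
    """
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the current best box with every remaining box
        w = np.clip(np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]), 0, None)
        h = np.clip(np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]), 0, None)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter + 1e-12)
        # Drop boxes that overlap the current best box too strongly
        order = rest[iou <= iou_thr]
    return keep
```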

Geometric Utilities:

bbox_xyxy2cs() -- Converts bounding boxes to center-scale format
get_warp_matrix() / top_down_affine() -- Affine transformation for cropping pose regions
decode() / get_simcc_maximum() -- SimCC-based keypoint coordinate decoding
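
The two ideas behind these utilities, center-scale boxes and SimCC decoding, can be sketched as below. This is an illustration only: the function names, the 1.25 padding factor, and the split ratio of 2.0 are assumptions for the example and may differ from the values the module uses.

```python
import numpy as np

def bbox_to_center_scale(bbox, padding=1.25):
    """Convert an (x1, y1, x2, y2) box to a center point and a padded size."""
    x1, y1, x2, y2 = bbox
    center = np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0], dtype=np.float32)
    scale = np.array([x2 - x1, y2 - y1], dtype=np.float32) * padding
    return center, scale

def simcc_argmax(simcc_x, simcc_y, split_ratio=2.0):
    """Decode SimCC outputs for one person.

    simcc_x: (K, Wx) and simcc_y: (K, Wy) are per-keypoint 1-D score vectors.
    The keypoint location is the argmax bin on each axis, divided by the
    split ratio to map bins back to input-crop pixels.
    """
    x_idx = simcc_x.argmax(axis=1)
    y_idx = simcc_y.argmax(axis=1)
    # Use the weaker of the two axis responses as the keypoint confidence
    conf = np.minimum(simcc_x.max(axis=1), simcc_y.max(axis=1))
    locs = np.stack([x_idx, y_idx], axis=-1).astype(np.float32) / split_ratio
    return locs, conf
```

The decoded locations are in the coordinate frame of the cropped model input, so they still have to be mapped back to the original image with the inverse of the crop transform.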

Drawing Functions:

draw_bodypose() -- Renders 18-keypoint body skeleton with colored limbs
draw_handpose() -- Renders 21-keypoint hand skeleton with HSV-colored edges
draw_facepose() -- Renders 68-point face landmarks as white dots
draw_pose() -- Combines body, hand, and face visualizations
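
The HSV-colored hand edges can be illustrated with a small sketch that assigns one hue per edge. The edge list below is the commonly used 21-keypoint hand topology (wrist at index 0, four joints per finger) and is an assumption here, as is the use of the standard-library colorsys module in place of whatever color conversion the module performs.

```python
import colorsys

# Assumed 21-keypoint hand topology: wrist (0) connects to the base of each finger
HAND_EDGES = [
    (0, 1), (1, 2), (2, 3), (3, 4),         # thumb
    (0, 5), (5, 6), (6, 7), (7, 8),         # index finger
    (0, 9), (9, 10), (10, 11), (11, 12),    # middle finger
    (0, 13), (13, 14), (14, 15), (15, 16),  # ring finger
    (0, 17), (17, 18), (18, 19), (19, 20),  # little finger
]

def hand_edge_colors(edges=HAND_EDGES):
    """Return one (R, G, B) tuple per edge, with hues spread evenly over the list."""
    n = len(edges)
    colors = []
    for i in range(n):
        r, g, b = colorsys.hsv_to_rgb(i / n, 1.0, 1.0)
        colors.append((int(r * 255), int(g * 255), int(b * 255)))
    return colors
```

Spreading the hue over the edge index gives each finger segment a distinct color without a hand-written palette.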

Detection Utilities:

handDetect() -- Locates hand bounding boxes relative to wrist/elbow/shoulder keypoints
faceDetect() -- Locates face bounding boxes relative to head/eye/ear keypoints
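
A hand box heuristic of the kind described for handDetect() can be sketched as follows. The constants (extending the forearm by a ratio of 0.33, a box width of 1.5 times the longer of the forearm and 0.9 times the upper arm) follow a heuristic commonly seen in OpenPose-style code and are assumptions here; the function name and return format are hypothetical and the module's actual values may differ.

```python
import math

def estimate_hand_box(shoulder, elbow, wrist, img_w, img_h, ratio=0.33):
    """Estimate a square hand box (left, top, width) from three arm joints.

    Each joint is an (x, y) pair in pixels. Illustrative heuristic only.
    """
    # Hand center: continue past the wrist along the elbow-to-wrist direction
    cx = wrist[0] + ratio * (wrist[0] - elbow[0])
    cy = wrist[1] + ratio * (wrist[1] - elbow[1])
    forearm = math.hypot(wrist[0] - elbow[0], wrist[1] - elbow[1])
    upper_arm = math.hypot(elbow[0] - shoulder[0], elbow[1] - shoulder[1])
    width = 1.5 * max(forearm, 0.9 * upper_arm)
    # Top-left corner, clamped to the image
    left = max(cx - width / 2.0, 0.0)
    top = max(cy - width / 2.0, 0.0)
    # Shrink the box if it would run past the right or bottom border
    width = min(width, img_w - left, img_h - top)
    return int(left), int(top), int(width)
```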

Usage

Used internally by pose estimation mapper operators. Requires ONNX detection and pose model files (e.g., yolox_l.onnx, dw-ll_ucoco_384.onnx).
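
Loading the two model files with ONNX Runtime might look like the sketch below. The page does not state how the device string is mapped to execution providers, so the mapping and both helper names are assumptions; onnxruntime.InferenceSession and the provider names are part of ONNX Runtime itself.

```python
def select_providers(device):
    """Map a device string to an ONNX Runtime provider list (assumed mapping)."""
    if device == "cuda":
        # Keep the CPU provider as a fallback if CUDA is unavailable
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]

def create_session(model_path, device="cpu"):
    """Create an inference session for one ONNX model file."""
    # Imported lazily so select_providers() is usable without onnxruntime installed
    import onnxruntime as ort
    return ort.InferenceSession(model_path, providers=select_providers(device))
```

Under this sketch, one session would be created for the detection model and a second for the pose model.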

Code Reference

Source Location

Repository: Datajuicer_Data_juicer
File: data_juicer/ops/common/dwpose_func.py
Lines: 1-934

Signature

class Wholebody:
    def __init__(self, onnx_det, onnx_pose, device): ...
    def __call__(self, oriImg) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: ...

class DWposeDetector:
    def __init__(self, onnx_det, onnx_pose, device): ...
    def __call__(self, oriImg) -> Tuple[body, foot, faces, hands, det_result, pose_image]: ...

def inference_detector(session, oriImg) -> np.ndarray: ...
def inference_pose(session, out_bbox, oriImg) -> Tuple[np.ndarray, np.ndarray]: ...
def draw_pose(pose, H, W) -> np.ndarray: ...
def handDetect(candidate, subset, oriImg) -> List: ...
def faceDetect(candidate, subset, oriImg) -> List: ...

Import

from data_juicer.ops.common.dwpose_func import DWposeDetector, Wholebody, draw_pose

I/O Contract

Inputs

Name	Type	Required	Description
onnx_det	str	Yes	Path to ONNX detection model (e.g., yolox_l.onnx)
onnx_pose	str	Yes	Path to ONNX pose estimation model (e.g., dw-ll_ucoco_384.onnx)
device	str	Yes	Device for inference ("cpu" or "cuda")
oriImg	np.ndarray	Yes	Input image as numpy array (H, W, C) in BGR/RGB format

Outputs

Name	Type	Description
body	np.ndarray	Body keypoints normalized to [0,1], shape (N, 18, 2)
foot	np.ndarray	Foot keypoints, shape (N, 6, 2)
faces	np.ndarray	Face landmarks, shape (N, 68, 2)
hands	np.ndarray	Hand keypoints, shape (2N, 21, 2)
det_result	np.ndarray	Person detection bounding boxes
pose_image	np.ndarray	Rendered pose visualization image
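
The output shapes above are consistent with a single (N, 134, 2) keypoint array that is normalized and then sliced into groups. The sketch below shows that idea; the contiguous ordering (body, foot, face, then the two hands) is inferred from the shapes in the table and is an assumption, as is the function name.

```python
import numpy as np

def split_keypoints(candidate, height, width):
    """Normalize pixel keypoints to [0, 1] and split them into groups.

    candidate: (N, 134, 2) array of (x, y) pixel coordinates.
    The slice boundaries are assumptions inferred from the documented shapes.
    """
    norm = candidate.astype(np.float32)  # astype returns a copy
    norm[..., 0] /= float(width)   # x divided by image width
    norm[..., 1] /= float(height)  # y divided by image height
    body = norm[:, :18]       # (N, 18, 2)
    foot = norm[:, 18:24]     # (N, 6, 2)
    faces = norm[:, 24:92]    # (N, 68, 2)
    # Stack the two 21-point hands so each hand is its own row: (2N, 21, 2)
    hands = np.vstack([norm[:, 92:113], norm[:, 113:134]])
    return body, foot, faces, hands
```

To map a normalized keypoint back to pixels, multiply x by the image width and y by the image height.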

Usage Examples

from data_juicer.ops.common.dwpose_func import DWposeDetector
import numpy as np

# Initialize detector with ONNX models
detector = DWposeDetector(
    onnx_det="models/yolox_l.onnx",
    onnx_pose="models/dw-ll_ucoco_384.onnx",
    device="cuda",
)

# Placeholder input: random noise with shape (H, W, C). Replace it with a real
# image to get meaningful detections; noise will usually contain no persons.
# The upper bound of randint is exclusive, so 256 covers the full uint8 range.
image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
body, foot, faces, hands, det_result, pose_image = detector(image)

print(f"Detected {body.shape[0]} persons")
print(f"Body keypoints shape: {body.shape}")  # (N, 18, 2)
