Implementation:Datajuicer Data juicer VideoWholeBodyPoseEstimationMapper

Knowledge Sources	Datajuicer_Data_juicer
Domains	Data_Processing, Mapping
Last Updated	2026-02-14 16:00 GMT

Overview

Concrete tool, provided by Data-Juicer, for extracting 2D whole-body pose keypoints from video frames.

Description

VideoWholeBodyPoseEstimationMapper extracts frames uniformly from videos, runs a YOLOX-based person detector (ONNX model) to locate human subjects, then applies the DWPose model (ONNX) to estimate whole-body keypoints including body, hands, feet, and face for each detected person. The pose estimation results are stored in sample metadata with separate arrays for each keypoint category. The operator supports configurable frame counts, video segmentation by duration, and optional visualization output saved to a specified directory.
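To make the interaction between frame_num and duration concrete, here is a minimal sketch of uniform frame sampling with optional segmentation. It is illustrative only: the function name uniform_frame_indices, the fps argument, and the bin-center sampling strategy are assumptions, and the operator's actual sampling code may choose different indices.

```python
from typing import List


def uniform_frame_indices(total_frames: int, fps: float,
                          frame_num: int = 3, duration: float = 0) -> List[List[int]]:
    """Return one list of frame indices per segment (hypothetical helper)."""
    if total_frames <= 0 or fps <= 0 or frame_num <= 0:
        raise ValueError("total_frames, fps and frame_num must be positive")

    # duration == 0 means the whole video is treated as a single segment.
    if duration <= 0:
        seg_len = total_frames
    else:
        seg_len = max(1, int(round(duration * fps)))

    segments = []
    for start in range(0, total_frames, seg_len):
        end = min(start + seg_len, total_frames)
        n = end - start
        # Take the center of each of frame_num equal-width bins in the segment.
        # If the segment has fewer frames than frame_num, indices may repeat.
        idxs = [start + min(n - 1, int((i + 0.5) * n / frame_num))
                for i in range(frame_num)]
        segments.append(idxs)
    return segments


whole_video = uniform_frame_indices(90, 30.0, frame_num=3, duration=0)
per_second = uniform_frame_indices(90, 30.0, frame_num=3, duration=1.0)
```

With duration set to 0 the sketch yields a single list of frame_num indices; with a positive duration it yields one such list per segment, so the total number of frames passed to detection and pose estimation grows with video length.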

Usage

Use when you need to extract human pose annotations from video datasets for applications in action recognition, motion capture dataset creation, and human-centric video understanding.

Code Reference

Source Location

Repository: Datajuicer_Data_juicer
File: data_juicer/ops/mapper/video_whole_body_pose_estimation_mapper.py

Signature

@OPERATORS.register_module("video_whole_body_pose_estimation_mapper")
class VideoWholeBodyPoseEstimationMapper(Mapper):
    def __init__(self, onnx_det_model: str = "yolox_l.onnx",
                 onnx_pose_model: str = "dw-ll_ucoco_384.onnx",
                 frame_num: PositiveInt = 3,
                 duration: float = 0,
                 tag_field_name: str = MetaKeys.pose_estimation_tags,
                 frame_dir: str = DATA_JUICER_ASSETS_CACHE,
                 if_save_visualization: bool = False,
                 save_visualization_dir: str = DATA_JUICER_ASSETS_CACHE,
                 *args, **kwargs):

Import

from data_juicer.ops.mapper.video_whole_body_pose_estimation_mapper import VideoWholeBodyPoseEstimationMapper

I/O Contract

Inputs

Name	Type	Required	Description
onnx_det_model	str	No	Path to the YOLOX detection ONNX model. Default: "yolox_l.onnx"
onnx_pose_model	str	No	Path to the DWPose estimation ONNX model. Default: "dw-ll_ucoco_384.onnx"
frame_num	PositiveInt	No	Number of frames to extract per video or per segment. Default: 3
duration	float	No	Duration of each video segment in seconds. 0 means the entire video. Default: 0
tag_field_name	str	No	Field name to store the pose estimation tags. Default: "pose_estimation_tags"
frame_dir	str	No	Output directory to save extracted frames. Default: DATA_JUICER_ASSETS_CACHE
if_save_visualization	bool	No	Whether to save visualization results. Default: False
save_visualization_dir	str	No	Path for saving visualization results. Default: DATA_JUICER_ASSETS_CACHE

Outputs

Name	Type	Description
sample[Fields.meta][tag_field_name]["body_keypoints"]	list	Body keypoints for each frame
sample[Fields.meta][tag_field_name]["foot_keypoints"]	list	Foot keypoints for each frame
sample[Fields.meta][tag_field_name]["faces_keypoints"]	list	Face keypoints for each frame
sample[Fields.meta][tag_field_name]["hands_keypoints"]	list	Hand keypoints for each frame
sample[Fields.meta][tag_field_name]["bbox_results_list"]	list	Bounding box detection results for each frame
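The output fields listed above can be read back from a processed sample along these lines. This is a sketch under stated assumptions: META_KEY is a placeholder standing in for the value of Fields.meta (in real code, import Fields from Data-Juicer instead), summarize_pose_tags is a hypothetical helper, and the mock sample uses empty placeholder lists rather than real keypoint data.

```python
from typing import Any, Dict

# Placeholder for Fields.meta; the real key string comes from Data-Juicer.
META_KEY = "meta"
# Default tag field name as documented in the Inputs table.
TAG_FIELD = "pose_estimation_tags"

# Output keys documented for this operator.
POSE_KEYS = (
    "body_keypoints",
    "foot_keypoints",
    "faces_keypoints",
    "hands_keypoints",
    "bbox_results_list",
)


def summarize_pose_tags(sample: Dict[str, Any],
                        meta_key: str = META_KEY,
                        tag_field: str = TAG_FIELD) -> Dict[str, int]:
    """Count entries (one per frame) in each output list; missing keys count as 0."""
    tags = sample.get(meta_key, {}).get(tag_field, {})
    return {key: len(tags.get(key, [])) for key in POSE_KEYS}


# Mock sample with three frames; the inner lists are empty placeholders.
mock_sample = {
    META_KEY: {
        TAG_FIELD: {key: [[], [], []] for key in POSE_KEYS}
    }
}

summary = summarize_pose_tags(mock_sample)
```

Because every list holds per-frame results, equal lengths across the five keys are a quick sanity check that all categories were populated for each extracted frame.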

Usage Examples

process:
  - video_whole_body_pose_estimation_mapper:
      frame_num: 5
      duration: 2.0
      if_save_visualization: true
      save_visualization_dir: "./pose_visualizations"
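The same configuration can presumably be expressed in Python by passing the documented constructor arguments directly. The sketch below uses only the import path and parameter names shown on this page; POSE_OP_KWARGS and build_pose_mapper are hypothetical names, and constructing the operator is assumed to require Data-Juicer and the two ONNX models to be available.

```python
# Keyword arguments mirroring the YAML recipe above.
POSE_OP_KWARGS = {
    "frame_num": 5,
    "duration": 2.0,
    "if_save_visualization": True,
    "save_visualization_dir": "./pose_visualizations",
}


def build_pose_mapper(**overrides):
    """Instantiate the operator with the recipe's settings (hypothetical helper)."""
    # Imported lazily so this module still loads where Data-Juicer is not installed.
    from data_juicer.ops.mapper.video_whole_body_pose_estimation_mapper import (
        VideoWholeBodyPoseEstimationMapper,
    )
    return VideoWholeBodyPoseEstimationMapper(**{**POSE_OP_KWARGS, **overrides})
```

Calling build_pose_mapper(frame_num=3) would override a single setting while keeping the rest of the recipe unchanged.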

Related Pages

Environment:Datajuicer_Data_juicer_Python_Runtime_Environment
