Implementation:Datajuicer Data juicer VideoFaceBlurMapper
| Knowledge Sources | |
|---|---|
| Domains | Data_Processing, Mapping |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Concrete tool, provided by Data-Juicer, for blurring faces detected in videos.
Description
VideoFaceBlurMapper is a mapper operator that detects and blurs faces in video frames to anonymize identities, supporting privacy-preserving data processing. It detects faces in each frame with an OpenCV Haar cascade classifier (default: haarcascade_frontalface_alt.xml), blurs the detected face regions with a configurable kernel ("mean", "box", or "gaussian") of adjustable radius, and saves the processed video to the output directory.
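The per-frame step described above (blur only the detected face regions, leave the rest of the frame untouched) can be illustrated with a minimal, dependency-free sketch. It is not the operator's implementation: the frame is assumed to be a grayscale grid of integers, the face boxes are assumed to arrive as hypothetical `(x, y, width, height)` tuples from some detector, and only a mean (box) kernel is shown. The names `box_blur_region` and `blur_faces` are illustrative.

```python
from typing import List, Tuple

Frame = List[List[int]]          # grayscale frame: rows of 0-255 pixel values
Box = Tuple[int, int, int, int]  # hypothetical (x, y, width, height) face box


def box_blur_region(frame: Frame, box: Box, radius: int = 1) -> Frame:
    """Return a copy of `frame` with a mean (box) blur applied inside `box`."""
    height, width = len(frame), len(frame[0])
    x0, y0, w, h = box
    out = [row[:] for row in frame]  # leave the input frame untouched
    for y in range(max(y0, 0), min(y0 + h, height)):
        for x in range(max(x0, 0), min(x0 + w, width)):
            # Average the (2 * radius + 1)^2 neighbourhood, clipped at frame edges.
            ys = range(max(y - radius, 0), min(y + radius + 1, height))
            xs = range(max(x - radius, 0), min(x + radius + 1, width))
            total = sum(frame[j][i] for j in ys for i in xs)
            out[y][x] = total // (len(ys) * len(xs))
    return out


def blur_faces(frame: Frame, boxes: List[Box], radius: int = 1) -> Frame:
    """Blur every detected box in one frame; pixels outside the boxes stay unchanged."""
    for box in boxes:
        frame = box_blur_region(frame, box, radius)
    return frame
```

A larger `radius` averages over a wider neighbourhood and therefore hides more facial detail, which is the trade-off the operator's `radius` parameter exposes.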
Usage
Use when you need face anonymization for privacy and compliance requirements (such as GDPR) before using video data for training or distribution.
Code Reference
Source Location
- Repository: Datajuicer_Data_juicer
- File: data_juicer/ops/mapper/video_face_blur_mapper.py
Signature
@OPERATORS.register_module("video_face_blur_mapper")
class VideoFaceBlurMapper(Mapper):
    def __init__(self, cv_classifier: str = "", blur_type: str = "gaussian", radius: float = 2, save_dir: str = None, *args, **kwargs):
Import
from data_juicer.ops.mapper.video_face_blur_mapper import VideoFaceBlurMapper
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| cv_classifier | str | No | OpenCV classifier path for face detection (default: "", which selects haarcascade_frontalface_alt.xml) |
| blur_type | str | No | Type of blur kernel: "mean", "box", or "gaussian" (default: "gaussian") |
| radius | float | No | Radius of the blur kernel (default: 2) |
| save_dir | str | No | Directory for generated video files; if not specified, saves alongside input files |
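The constraints in the table above can be checked before a long processing run starts. The following is a minimal sketch of such a check, not the operator's own validation code: the helper name `validate_blur_params` is hypothetical, and rejecting a negative radius is an assumption made for illustration.

```python
SUPPORTED_BLUR_TYPES = ("mean", "box", "gaussian")  # values listed in the table


def validate_blur_params(blur_type: str = "gaussian", radius: float = 2) -> None:
    """Raise ValueError if the blur settings fall outside the documented options."""
    if blur_type not in SUPPORTED_BLUR_TYPES:
        raise ValueError(
            f"blur_type must be one of {SUPPORTED_BLUR_TYPES}, got {blur_type!r}"
        )
    # Assumption: a negative kernel radius is treated as invalid.
    if radius < 0:
        raise ValueError(f"radius must be >= 0, got {radius}")
```

Calling `validate_blur_params("gaussian", 3)` returns silently, while `validate_blur_params("median", 3)` raises a `ValueError` that names the supported options.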
Outputs
| Name | Type | Description |
|---|---|---|
| samples | Dict | Transformed samples with face-blurred video file paths |
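The `save_dir` fallback (write next to the input file when no directory is given) determines where the returned paths point. A minimal sketch of that directory choice follows; `resolve_output_dir` is a hypothetical helper, and the naming of the generated file itself is deliberately left out because the source does not specify it.

```python
import os
from typing import Optional


def resolve_output_dir(video_path: str, save_dir: Optional[str] = None) -> str:
    """Pick the directory for a generated video: save_dir if given, else the input's directory."""
    if save_dir:
        return save_dir
    # Fall back to the directory that holds the input video ("." for a bare filename).
    return os.path.dirname(video_path) or "."
```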
Usage Examples
process:
  - video_face_blur_mapper:
      blur_type: "gaussian"
      radius: 3