Implementation:Datajuicer Data juicer ImageRemoveBackgroundMapper
| Knowledge Sources | |
|---|---|
| Domains | Data_Processing, Mapping |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Concrete tool, provided by Data-Juicer, for removing backgrounds from the images in a dataset.
Description
ImageRemoveBackgroundMapper is a mapper operator that removes the background from images using the rembg library (backed by ONNX Runtime). It supports optional alpha matting with configurable foreground/background thresholds and erosion size for smoother edges. A custom background color can be specified via the bgcolor parameter. Processed images are saved in PNG format to a configurable output directory, and source file paths in the sample are updated accordingly.
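The two alpha matting thresholds can be read as splitting a 0-255 foreground mask into "definitely foreground", "definitely background", and "unknown" regions. The following is a minimal sketch of that idea, not the operator's or rembg's actual code. The function name `build_trimap` and the marker values 255/128/0 are assumptions made for illustration, and the erosion step controlled by `alpha_matting_erode_size` is omitted.

```python
from typing import List

# Illustrative marker values for the three regions (assumed, not taken from the source).
FOREGROUND, UNKNOWN, BACKGROUND = 255, 128, 0


def build_trimap(mask: List[List[int]],
                 foreground_threshold: int = 240,
                 background_threshold: int = 10) -> List[List[int]]:
    """Classify each mask value (0-255) as foreground, background, or unknown."""
    trimap = []
    for row in mask:
        out_row = []
        for value in row:
            if value > foreground_threshold:
                out_row.append(FOREGROUND)   # confident foreground
            elif value < background_threshold:
                out_row.append(BACKGROUND)   # confident background
            else:
                out_row.append(UNKNOWN)      # left for the matting step to resolve
        trimap.append(out_row)
    return trimap


# A tiny 2x3 mask using the default thresholds from the signature.
mask = [[0, 5, 120], [200, 245, 255]]
trimap = build_trimap(mask)
print(trimap)
```

Under this reading, raising the foreground threshold or lowering the background threshold widens the unknown band that the matting step has to resolve.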
Usage
Use this operator when you need to isolate foreground subjects from their backgrounds, for example to build clean, subject-focused training data, prepare image segmentation datasets, or feed compositing workflows.
Code Reference
Source Location
- Repository: Datajuicer_Data_juicer
- File: data_juicer/ops/mapper/image_remove_background_mapper.py
Signature
@OPERATORS.register_module("image_remove_background_mapper")
class ImageRemoveBackgroundMapper(Mapper):
    def __init__(self,
                 alpha_matting: bool = False,
                 alpha_matting_foreground_threshold: int = 240,
                 alpha_matting_background_threshold: int = 10,
                 alpha_matting_erode_size: int = 10,
                 bgcolor: Optional[Tuple[int, int, int, int]] = None,
                 save_dir: str = None,
                 *args, **kwargs):
Import
from data_juicer.ops.mapper.image_remove_background_mapper import ImageRemoveBackgroundMapper
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| alpha_matting | bool | No | Whether to use alpha matting, defaults to False |
| alpha_matting_foreground_threshold | int | No | Foreground threshold for alpha matting, defaults to 240 |
| alpha_matting_background_threshold | int | No | Background threshold for alpha matting, defaults to 10 |
| alpha_matting_erode_size | int | No | Erosion size for alpha matting, defaults to 10 |
| bgcolor | Optional[Tuple[int, int, int, int]] | No | Background color for the cutout image (RGBA), defaults to None |
| save_dir | str | No | Directory in which to store the generated images; if not specified, they are saved in the same directory as the input files |
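To make the `bgcolor` parameter more concrete: a cutout has transparent pixels where the background was removed, so filling it with an RGBA color can be thought of as alpha-compositing the cutout over a solid layer of that color. The per-pixel sketch below illustrates the standard "over" operator under that assumption. `composite_over` is a hypothetical helper, not part of Data-Juicer or rembg.

```python
from typing import Tuple

RGBA = Tuple[int, int, int, int]


def composite_over(fg: RGBA, bg: RGBA) -> RGBA:
    """Alpha-composite one foreground pixel over one background pixel."""
    fa = fg[3] / 255                      # foreground alpha in [0, 1]
    ba = bg[3] / 255                      # background alpha in [0, 1]
    out_a = fa + ba * (1 - fa)            # resulting alpha
    if out_a == 0:
        return (0, 0, 0, 0)               # fully transparent result
    r, g, b = (
        round((fg[i] * fa + bg[i] * ba * (1 - fa)) / out_a) for i in range(3)
    )
    return (r, g, b, round(out_a * 255))


white: RGBA = (255, 255, 255, 255)        # example bgcolor value

# An opaque subject pixel keeps its color; a removed (transparent) pixel takes bgcolor.
subject_pixel = composite_over((10, 20, 30, 255), white)
removed_pixel = composite_over((0, 0, 0, 0), white)
print(subject_pixel, removed_pixel)
```

With `bgcolor` left at `None`, no fill is applied under this model and the removed regions stay transparent in the saved PNG.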
Outputs
| Name | Type | Description |
|---|---|---|
| samples | Dict | Transformed samples whose image paths are updated to point to the background-removed PNG files |
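The path update described in the output contract can be sketched as follows. This is an illustration only: the helpers `output_path_for` and `update_sample`, the `_no_bg` filename suffix, and the `images` sample key are assumptions, and the real operator may name its output files differently.

```python
import os
from typing import Dict, List, Optional


def output_path_for(src_path: str, save_dir: Optional[str] = None) -> str:
    """Return a hypothetical PNG output path for one input image."""
    stem = os.path.splitext(os.path.basename(src_path))[0]
    # Fall back to the input file's directory when save_dir is not given.
    target_dir = save_dir if save_dir is not None else os.path.dirname(src_path)
    return os.path.join(target_dir, f"{stem}_no_bg.png")


def update_sample(sample: Dict[str, List[str]],
                  save_dir: Optional[str] = None) -> Dict[str, List[str]]:
    """Return a copy of the sample whose image paths point to the new PNG files."""
    updated = dict(sample)
    updated["images"] = [output_path_for(p, save_dir) for p in sample["images"]]
    return updated


sample = {"images": [os.path.join("data", "cat.jpg"), os.path.join("data", "dog.jpeg")]}
with_save_dir = update_sample(sample, save_dir="out")
without_save_dir = update_sample(sample)
print(with_save_dir["images"])
print(without_save_dir["images"])
```

The sketch only rewrites paths; in the operator, the same step would follow the actual background removal and the write of each PNG file.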
Usage Examples
process:
  - image_remove_background_mapper:
      alpha_matting: false