Implementation:Datajuicer Data juicer ImageTaggingMapper

From Leeroopedia
Knowledge Sources
Domains Data_Processing, Mapping
Last Updated 2026-02-14 16:00 GMT

Overview

Concrete tool, provided by Data-Juicer, for generating descriptive tags for images using the RAM Plus model.

Description

ImageTaggingMapper is a mapper operator that generates image tags using the RAM (Recognize Anything Model) Plus model with a Swin Large backbone (ram_plus_swin_large_14m). It processes images at 384x384 resolution and calls the model's generate_tag method, which returns a pipe-delimited tag string. The operator splits that string into individual tags, orders them by frequency using a Counter, and stores the resulting tag array in the sample metadata under the configured field name (default: image_tags). Samples that already contain tags under that field are skipped. The operator requires CUDA acceleration and approximately 9 GB of GPU memory.
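The post-processing step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Data-Juicer's actual source: the helper name sort_tags_by_frequency and the sample tag string are hypothetical, and only the standard library is used.

```python
from collections import Counter


def sort_tags_by_frequency(raw_tags: str) -> list[str]:
    """Split a pipe-delimited tag string and order unique tags by frequency.

    Hypothetical helper for illustration; not part of Data-Juicer's API.
    """
    # Split on the pipe delimiter, trim whitespace, and drop empty pieces.
    tags = [tag.strip() for tag in raw_tags.split("|") if tag.strip()]
    # most_common() orders by descending count; ties keep first-seen order.
    return [tag for tag, _ in Counter(tags).most_common()]


# Hypothetical model output, made up for this example.
example = "dog | grass | dog | ball"
print(sort_tags_by_frequency(example))  # ['dog', 'grass', 'ball']
```

Because each tag appears once in the result, the Counter effectively deduplicates the tags while moving repeated ones to the front.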

Usage

Use this operator when you need automated semantic annotation of images without manual labeling, for example to enable content-based filtering, search, and organization of image datasets.

Code Reference

Source Location

Signature

@OPERATORS.register_module("image_tagging_mapper")
class ImageTaggingMapper(Mapper):
    def __init__(self,
                 tag_field_name: str = MetaKeys.image_tags,
                 *args, **kwargs):

Import

from data_juicer.ops.mapper.image_tagging_mapper import ImageTaggingMapper

I/O Contract

Inputs

Name | Type | Required | Description
tag_field_name | str | No | Field name under which the tags are stored in metadata; defaults to MetaKeys.image_tags

Outputs

Name | Type | Description
samples | Dict | Transformed samples with frequency-sorted image tags stored in the meta field
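As a rough sketch of this contract, the snippet below mimics the skip-if-present behavior on a plain dictionary. Everything named here is a hypothetical placeholder: the "meta" and "images" keys, the tag_sample helper, and the stand-in fake_tagger. The real operator uses Data-Juicer's own field constants and the RAM Plus model.

```python
from typing import Callable, Dict, List

META_KEY = "meta"      # hypothetical placeholder for the sample's metadata field
IMAGE_KEY = "images"   # hypothetical placeholder for the sample's image list


def tag_sample(sample: Dict,
               tagger: Callable[[str], List[str]],
               tag_field_name: str = "image_tags") -> Dict:
    """Attach one tag list per image unless tags are already present."""
    meta = sample.setdefault(META_KEY, {})
    # Skip reprocessing when the configured field already holds tags.
    if tag_field_name in meta:
        return sample
    meta[tag_field_name] = [tagger(path) for path in sample.get(IMAGE_KEY, [])]
    return sample


def fake_tagger(path: str) -> List[str]:
    """Stand-in for the model call; returns fixed tags for illustration."""
    return ["dog", "grass"]


fresh = tag_sample({"images": ["a.jpg"]}, fake_tagger)
cached = tag_sample(
    {"images": ["a.jpg"], "meta": {"image_tags": [["cat"]]}}, fake_tagger
)
print(fresh["meta"]["image_tags"])   # [['dog', 'grass']]
print(cached["meta"]["image_tags"])  # [['cat']]
```

The second call leaves the existing tags untouched, which is the behavior that makes rerunning a pipeline over partially tagged data cheap.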

Usage Examples

process:
  - image_tagging_mapper:
      tag_field_name: "image_tags"
