Implementation:Datajuicer Data juicer QuerySentimentDetectionMapper

From Leeroopedia
Knowledge Sources
Domains Data_Processing, Mapping
Last Updated 2026-02-14 16:00 GMT

Overview

Concrete tool, provided by Data-Juicer, for detecting the sentiment of user queries.

Description

QuerySentimentDetectionMapper is a mapper operator that detects the sentiment of user queries (negative, neutral, or positive) and stores the predicted label and confidence score in the sample metadata. It employs a HuggingFace text-classification pipeline with optional Chinese-to-English translation before classification. Samples that already have sentiment annotations are skipped. Requires CUDA acceleration and operates in batched mode.
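The flow described above (skip annotated samples, optionally translate, classify, write label and score into the metadata) can be sketched roughly as follows. This is a minimal illustration, not the operator's actual source: `tag_sentiment` and `fake_classifier` are hypothetical names, the classifier is a stand-in that mimics the list of `{"label", "score"}` dicts a HuggingFace text-classification pipeline returns, and the skip rule (label key already present) is an assumption.

```python
from typing import Callable, Dict, List, Optional

# Default key names as listed in the I/O contract of this page.
LABEL_KEY = "query_sentiment_label"
SCORE_KEY = "query_sentiment_score"


def tag_sentiment(
    queries: List[str],
    metas: List[Dict],
    classify: Callable[[List[str]], List[Dict]],
    translate: Optional[Callable[[List[str]], List[str]]] = None,
) -> List[Dict]:
    """Write a label and score into each meta dict that has no label yet."""
    # Assumed skip rule: a sample with an existing label is left untouched.
    todo = [i for i, meta in enumerate(metas) if LABEL_KEY not in meta]
    if not todo:
        return metas

    texts = [queries[i] for i in todo]
    # Optional translation step (e.g. Chinese to English) before classification.
    if translate is not None:
        texts = translate(texts)

    # One result dict per input text, processed as a batch.
    results = classify(texts)
    for i, result in zip(todo, results):
        metas[i][LABEL_KEY] = result["label"]
        metas[i][SCORE_KEY] = result["score"]
    return metas


def fake_classifier(texts: List[str]) -> List[Dict]:
    """Hypothetical stand-in for a real text-classification pipeline."""
    return [
        {"label": "negative", "score": 0.9}
        if "bad" in text.lower()
        else {"label": "neutral", "score": 0.6}
        for text in texts
    ]


queries = ["This is bad.", "What time is it?", "Already tagged"]
metas = [{}, {}, {LABEL_KEY: "positive", SCORE_KEY: 0.8}]
tagged = tag_sentiment(queries, metas, fake_classifier)
```

In this sketch the third sample keeps its existing annotation, while the first two receive the stand-in classifier's output. In real use, `classify` would be replaced by a loaded HuggingFace pipeline.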

Usage

Use this operator when you need sentiment-based tagging and filtering of training data, for example when curating a dataset with a specific distribution of emotional tone.
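Once samples carry a sentiment label and score, downstream curation might look like the sketch below. It assumes each sample is a plain dict whose metadata lives under a field named `meta` (a hypothetical layout; the actual field name in Data-Juicer may differ) and uses the default key names from the I/O contract. `label_distribution` and `keep_confident` are illustrative helpers, not part of Data-Juicer.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set

META_FIELD = "meta"  # assumed field name, for illustration only
LABEL_KEY = "query_sentiment_label"
SCORE_KEY = "query_sentiment_score"


def label_distribution(samples: Iterable[Dict]) -> Counter:
    """Count how many samples carry each sentiment label."""
    return Counter(s[META_FIELD].get(LABEL_KEY) for s in samples)


def keep_confident(samples: Iterable[Dict], wanted: Set[str], min_score: float) -> List[Dict]:
    """Keep samples whose label is in `wanted` and whose score is at least `min_score`."""
    return [
        s
        for s in samples
        if s[META_FIELD].get(LABEL_KEY) in wanted
        and s[META_FIELD].get(SCORE_KEY, 0.0) >= min_score
    ]


samples = [
    {"query": "a", META_FIELD: {LABEL_KEY: "negative", SCORE_KEY: 0.95}},
    {"query": "b", META_FIELD: {LABEL_KEY: "neutral", SCORE_KEY: 0.55}},
    {"query": "c", META_FIELD: {LABEL_KEY: "negative", SCORE_KEY: 0.40}},
]

distribution = label_distribution(samples)
confident_negative = keep_confident(samples, {"negative"}, min_score=0.8)
```

Checking the distribution first makes it easier to choose a score threshold that does not remove one sentiment class entirely.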

Code Reference

Source Location

Signature

@OPERATORS.register_module("query_sentiment_detection_mapper")
class QuerySentimentDetectionMapper(Mapper):
    def __init__(
        self,
        hf_model: str = "mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis",
        zh_to_en_hf_model: Optional[str] = "Helsinki-NLP/opus-mt-zh-en",
        model_params: Dict = {},
        zh_to_en_model_params: Dict = {},
        *,
        label_key: str = MetaKeys.query_sentiment_label,
        score_key: str = MetaKeys.query_sentiment_score,
        **kwargs,
    ):

Import

from data_juicer.ops.mapper.query_sentiment_detection_mapper import QuerySentimentDetectionMapper

I/O Contract

Inputs

Name | Type | Required | Description
hf_model | str | No | HuggingFace model ID for sentiment classification (default: mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis)
zh_to_en_hf_model | Optional[str] | No | Translation model from Chinese to English (default: Helsinki-NLP/opus-mt-zh-en)
model_params | Dict | No | Model parameters for the sentiment classification model
zh_to_en_model_params | Dict | No | Model parameters for the translation model
label_key | str | No | Key name in meta field for the output label (default: query_sentiment_label)
score_key | str | No | Key name in meta field for the output score (default: query_sentiment_score)

Outputs

Name | Type | Description
meta[label_key] | str | Predicted sentiment label (negative, neutral, or positive)
meta[score_key] | float | Confidence score for the predicted sentiment

Usage Examples

process:
  - query_sentiment_detection_mapper:
      hf_model: 'mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis'
      zh_to_en_hf_model: 'Helsinki-NLP/opus-mt-zh-en'
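The same configuration could presumably be expressed directly in Python using the constructor from the Signature section. The sketch below is untested against a live Data-Juicer installation. `OP_KWARGS` and `build_mapper` are hypothetical names, and calling `build_mapper()` requires Data-Juicer and may load the referenced models.

```python
# Keyword arguments mirroring the YAML recipe above.
OP_KWARGS = {
    "hf_model": "mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis",
    "zh_to_en_hf_model": "Helsinki-NLP/opus-mt-zh-en",
}


def build_mapper():
    """Construct the operator; needs Data-Juicer installed to actually run."""
    # Imported lazily so this sketch can be loaded without Data-Juicer present.
    from data_juicer.ops.mapper.query_sentiment_detection_mapper import (
        QuerySentimentDetectionMapper,
    )

    return QuerySentimentDetectionMapper(**OP_KWARGS)
```

Because `zh_to_en_hf_model` is typed `Optional[str]`, passing `None` is presumably how the translation step is turned off, though that behavior is an assumption here.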
