Implementation:Datajuicer Data juicer QuerySentimentDetectionMapper
| Knowledge Sources | |
|---|---|
| Domains | Data_Processing, Mapping |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
A concrete tool, provided by Data-Juicer, for detecting the sentiment of user queries.
Description
QuerySentimentDetectionMapper is a mapper operator that classifies the sentiment of each user query as negative, neutral, or positive and stores the predicted label and confidence score in the sample metadata. It uses a HuggingFace text-classification pipeline, with optional Chinese-to-English translation applied before classification. Samples that already have sentiment annotations are skipped. The operator requires CUDA acceleration and runs in batched mode.
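The flow described above (skip annotated samples, optionally translate, classify in one batch, write label and score to the metadata) can be sketched as follows. This is a minimal illustration, not the operator's actual code: `tag_sentiment`, `toy_classifier`, and the per-sample skip check are hypothetical, the key names simply mirror the documented defaults, and the classifier and translator are passed in as plain callables so the sketch runs without downloading any model.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Assumed key names, mirroring the documented defaults of label_key / score_key.
LABEL_KEY = "query_sentiment_label"
SCORE_KEY = "query_sentiment_score"

# Hypothetical stand-ins for the HuggingFace pipelines.
Classifier = Callable[[List[str]], List[Tuple[str, float]]]
Translator = Callable[[List[str]], List[str]]


def tag_sentiment(
    queries: List[str],
    metas: List[Dict],
    classify: Classifier,
    translate: Optional[Translator] = None,
) -> List[Dict]:
    """Annotate each meta dict with a sentiment label and score, in place."""
    # Skip samples that already carry a sentiment label.
    todo = [i for i, meta in enumerate(metas) if LABEL_KEY not in meta]
    if not todo:
        return metas
    texts = [queries[i] for i in todo]
    # Optional translation step before classification.
    if translate is not None:
        texts = translate(texts)
    # One batched classifier call for all pending queries.
    for i, (label, score) in zip(todo, classify(texts)):
        metas[i][LABEL_KEY] = label
        metas[i][SCORE_KEY] = score
    return metas


def toy_classifier(texts: List[str]) -> List[Tuple[str, float]]:
    """Keyword-based stand-in for a real text-classification pipeline."""
    results = []
    for text in texts:
        lowered = text.lower()
        if "bad" in lowered:
            results.append(("negative", 0.9))
        elif "great" in lowered:
            results.append(("positive", 0.9))
        else:
            results.append(("neutral", 0.5))
    return results
```

Passing the models in as callables keeps the control flow separate from model loading; in a real run, `classify` would wrap a text-classification pipeline and `translate` a translation pipeline.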
Usage
Use this operator when you need to tag or filter training data by sentiment, for example when curating datasets with a specific distribution of emotional tone.
Code Reference
Source Location
- Repository: Datajuicer_Data_juicer
- File: data_juicer/ops/mapper/query_sentiment_detection_mapper.py
Signature
@OPERATORS.register_module("query_sentiment_detection_mapper")
class QuerySentimentDetectionMapper(Mapper):
    def __init__(
        self,
        hf_model: str = "mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis",
        zh_to_en_hf_model: Optional[str] = "Helsinki-NLP/opus-mt-zh-en",
        model_params: Dict = {},
        zh_to_en_model_params: Dict = {},
        *,
        label_key: str = MetaKeys.query_sentiment_label,
        score_key: str = MetaKeys.query_sentiment_score,
        **kwargs,
    ):
Import
from data_juicer.ops.mapper.query_sentiment_detection_mapper import QuerySentimentDetectionMapper
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| hf_model | str | No | HuggingFace model ID for sentiment classification (default: mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis) |
| zh_to_en_hf_model | Optional[str] | No | Translation model from Chinese to English (default: Helsinki-NLP/opus-mt-zh-en) |
| model_params | Dict | No | Model parameters for the sentiment classification model |
| zh_to_en_model_params | Dict | No | Model parameters for the translation model |
| label_key | str | No | Key name in meta field for the output label (default: query_sentiment_label) |
| score_key | str | No | Key name in meta field for the output score (default: query_sentiment_score) |
Outputs
| Name | Type | Description |
|---|---|---|
| meta[label_key] | str | Predicted sentiment label (negative, neutral, or positive) |
| meta[score_key] | float | Confidence score for the predicted sentiment |
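Once the label and score are in the metadata, downstream code can inspect the tone distribution or keep only confident predictions. The sketch below assumes samples are plain dictionaries whose metadata lives under a placeholder `"meta"` key and uses the default key names from the table; the field name, `label_distribution`, `keep_confident`, and the `0.8` threshold are illustrative choices rather than part of Data-Juicer.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set

# Assumed names: META_FIELD is a placeholder for the dataset's meta field;
# the label/score keys mirror the documented defaults.
META_FIELD = "meta"
LABEL_KEY = "query_sentiment_label"
SCORE_KEY = "query_sentiment_score"


def label_distribution(samples: Iterable[Dict]) -> Counter:
    """Count how many annotated samples fall under each sentiment label."""
    return Counter(
        sample[META_FIELD][LABEL_KEY]
        for sample in samples
        if LABEL_KEY in sample.get(META_FIELD, {})
    )


def keep_confident(
    samples: Iterable[Dict], wanted: Set[str], min_score: float = 0.8
) -> List[Dict]:
    """Keep samples whose label is in `wanted` and whose score is high enough."""
    kept = []
    for sample in samples:
        meta = sample.get(META_FIELD, {})
        if meta.get(LABEL_KEY) in wanted and meta.get(SCORE_KEY, 0.0) >= min_score:
            kept.append(sample)
    return kept
```

Samples without a sentiment annotation are ignored by both helpers, so they can be applied safely to a partially processed dataset.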
Usage Examples
process:
  - query_sentiment_detection_mapper:
      hf_model: 'mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis'
      zh_to_en_hf_model: 'Helsinki-NLP/opus-mt-zh-en'