
Implementation:Evidentlyai Evidently Legacy HuggingFace Feature

From Leeroopedia
Knowledge Sources
Domains ML Monitoring, NLP, HuggingFace Integration
Last Updated 2026-02-14 12:00 GMT

Overview

This module provides generated feature classes that use HuggingFace Transformers models to compute text-based features such as emotion scores, AI-generated-text detection, zero-shot classification, toxicity measurement, and PII detection.

Description

The module defines two primary classes:

HuggingFaceFeature is a generic feature class that dispatches to a registry of supported HuggingFace models. It extends both FeatureTypeFieldMixin and DataFeature. The supported models are registered in the _models dictionary, each mapping to a tuple of (ColumnType, list of available parameters, callable function). The currently supported models include:

  • SamLowe/roberta-base-go_emotions - Returns numerical emotion scores for a given label using a text-classification pipeline.
  • openai-community/roberta-base-openai-detector - Detects AI-generated text, returning a categorical label based on a score threshold.
  • MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli - Performs zero-shot classification with configurable labels and a confidence threshold.
  • DaNLP/da-electra-hatespeech-detection - Computes toxicity scores for hate speech detection using the HuggingFace evaluate library.
  • facebook/roberta-hate-speech-dynabench-r4-target - Computes toxicity scores for hate speech using the evaluate library.
  • lakshyakh93/deberta_finetuned_pii - Performs PII (personally identifiable information) detection via token classification.
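The registry-dispatch pattern described above can be sketched in plain Python. This is a simplified stand-in, not Evidently's actual implementation: the scorer functions, their outputs, and the parameter validation are hypothetical, and the registry shape (model id mapped to a tuple of column type, accepted parameters, and a callable) follows the description above.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple

class ColumnType(Enum):
    Numerical = "num"
    Categorical = "cat"

def _emotion_scorer(text: str, label: str) -> float:
    # Stand-in for a text-classification pipeline call that returns
    # the score for one emotion label.
    return 0.9 if label in text else 0.1

def _detector(text: str, score_threshold: float = 0.5) -> str:
    # Stand-in for an AI-text detector: a raw score is reduced to a
    # categorical label via a threshold, as described for the
    # openai-detector model.
    score = 0.8 if "generated" in text else 0.2
    return "fake" if score > score_threshold else "real"

# Hypothetical registry: model id -> (output column type, accepted params, scorer)
_models: Dict[str, Tuple[ColumnType, List[str], Callable]] = {
    "SamLowe/roberta-base-go_emotions": (ColumnType.Numerical, ["label"], _emotion_scorer),
    "openai-community/roberta-base-openai-detector": (ColumnType.Categorical, ["score_threshold"], _detector),
}

def compute_feature(model: str, text: str, params: dict):
    # Dispatch: look up the model, validate params against the
    # registered list, then call the model-specific function.
    if model not in _models:
        raise ValueError(f"Unsupported model: {model}")
    _column_type, allowed, fn = _models[model]
    unknown = set(params) - set(allowed)
    if unknown:
        raise ValueError(f"Unknown params for {model}: {unknown}")
    return fn(text, **params)
```

Keeping the registry as data rather than a chain of if/else branches makes adding a model a one-line change and lets the feature class report the expected column type and parameters without running the model.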

HuggingFaceToxicityFeature is a specialized feature class for computing toxicity scores. It uses the HuggingFace evaluate library with a configurable model name and toxic label.
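Conceptually, a toxicity feature of this kind maps each text to the probability the underlying classifier assigns to a configurable toxic label. The sketch below illustrates that contract with a fake classifier in place of the real model call; the function name, the mocked outputs, and the default label are illustrative assumptions, not the library's API.

```python
from typing import List

def toxicity_scores(texts: List[str], toxic_label: str = "hate") -> List[float]:
    """Stand-in for evaluate-based toxicity scoring: returns the probability
    assigned to `toxic_label` for each text. A real implementation would run
    a HuggingFace hate-speech classification model."""
    # Hypothetical classifier output: label -> probability, per text.
    fake_outputs = [
        {"hate": 0.92, "nothate": 0.08} if "hateful" in t
        else {"hate": 0.03, "nothate": 0.97}
        for t in texts
    ]
    # The configurable toxic_label selects which class's score becomes
    # the numerical feature value.
    return [out[toxic_label] for out in fake_outputs]
```

Making the toxic label configurable matters because different hate-speech models name their positive class differently (e.g. "hate" vs. "offensive").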

Internal helper functions include _samlowe_roberta_base_go_emotions, _openai_detector, _lmnli_fever, _toxicity, _dfp, and _map_labels, which implement the model-specific logic for each supported model.

Usage

Use HuggingFaceFeature when you need to generate text features using any of the supported HuggingFace models within Evidently monitoring pipelines. Use HuggingFaceToxicityFeature specifically for toxicity scoring. These are legacy features intended for use with Evidently's legacy metric and report framework.

Code Reference

Source Location

Signature

class HuggingFaceFeature(FeatureTypeFieldMixin, DataFeature):
    class Config:
        type_alias = "evidently:feature:HuggingFaceFeature"

    column_name: str
    model: str
    params: dict

    def __init__(self, *, column_name: str, model: str, params: dict, display_name: str): ...
    def generate_data(self, data: pd.DataFrame, data_definition: DataDefinition) -> pd.Series: ...

class HuggingFaceToxicityFeature(DataFeature):
    class Config:
        type_alias = "evidently:feature:HuggingFaceToxicityFeature"

    __feature_type__: ClassVar = ColumnType.Numerical
    column_name: str
    model: Optional[str]
    toxic_label: Optional[str]

    def __init__(self, *, column_name: str, display_name: str, model: Optional[str] = None, toxic_label: Optional[str] = None): ...
    def generate_data(self, data: pd.DataFrame, data_definition: DataDefinition) -> pd.Series: ...

Import

from evidently.legacy.features.hf_feature import HuggingFaceFeature
from evidently.legacy.features.hf_feature import HuggingFaceToxicityFeature

I/O Contract

Inputs (HuggingFaceFeature)

Name | Type | Required | Description
column_name | str | Yes | Name of the text column in the DataFrame to analyze
model | str | Yes | HuggingFace model identifier (must be a key in the _models registry)
params | dict | Yes | Dictionary of model-specific parameters (e.g., label, score_threshold, labels, threshold)
display_name | str | Yes | Human-readable name for the generated feature column

Inputs (HuggingFaceToxicityFeature)

Name | Type | Required | Description
column_name | str | Yes | Name of the text column in the DataFrame to analyze
display_name | str | Yes | Human-readable name for the generated feature column
model | Optional[str] | No | HuggingFace model name for toxicity evaluation (defaults to None)
toxic_label | Optional[str] | No | Label to use for toxicity scoring (defaults to None)

Outputs

Name | Type | Description
return | pd.Series | A pandas Series containing the computed feature values (numerical scores or categorical labels depending on the model)
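The essential part of this contract is that generate_data reads one text column and returns a Series aligned to the input DataFrame's index, so the feature can be joined back onto the original data. A minimal stub, with a placeholder scoring rule (text length) standing in for a real model:

```python
import pandas as pd

def generate_data_stub(data: pd.DataFrame, column_name: str, display_name: str) -> pd.Series:
    # Mimics the I/O contract: read one text column, emit one feature
    # Series aligned to the input index and named for display.
    scores = data[column_name].str.len() / 100.0  # placeholder "feature"
    return scores.rename(display_name)

df = pd.DataFrame({"text": ["short", "a much longer message"]}, index=[10, 11])
out = generate_data_stub(df, "text", "Length Score")
```

Preserving the index is what allows the generated feature to be assigned directly, e.g. `df[out.name] = out`, without reordering rows.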

Usage Examples

# Example: Emotion detection using SamLowe/roberta-base-go_emotions
from evidently.legacy.features.hf_feature import HuggingFaceFeature

feature = HuggingFaceFeature(
    column_name="text",
    model="SamLowe/roberta-base-go_emotions",
    params={"label": "joy"},
    display_name="Joy Score"
)

# Example: Zero-shot classification
feature_zs = HuggingFaceFeature(
    column_name="text",
    model="MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli",
    params={"labels": ["positive", "negative", "neutral"], "threshold": 0.7},
    display_name="Sentiment Classification"
)

# Example: Toxicity scoring
from evidently.legacy.features.hf_feature import HuggingFaceToxicityFeature

toxicity_feature = HuggingFaceToxicityFeature(
    column_name="text",
    display_name="Toxicity Score",
    model="facebook/roberta-hate-speech-dynabench-r4-target",
    toxic_label="hate"
)
