Implementation: Evidentlyai Evidently Legacy HuggingFace Feature
| Knowledge Sources | |
|---|---|
| Domains | ML Monitoring, NLP, HuggingFace Integration |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
Provides generated feature classes that leverage HuggingFace Transformers models to compute text-based features such as emotion scores, AI-generated text detection, zero-shot classification, toxicity measurement, and PII detection.
Description
The module defines two primary classes:
HuggingFaceFeature is a generic feature class that dispatches to a registry of supported HuggingFace models. It extends both FeatureTypeFieldMixin and DataFeature. The supported models are registered in the _models dictionary, each mapping to a tuple of (ColumnType, list of available parameters, callable function). The currently supported models include:
- SamLowe/roberta-base-go_emotions - Returns numerical emotion scores for a given label using a text-classification pipeline.
- openai-community/roberta-base-openai-detector - Detects AI-generated text, returning a categorical label based on a score threshold.
- MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli - Performs zero-shot classification with configurable labels and a confidence threshold.
- DaNLP/da-electra-hatespeech-detection - Computes toxicity scores for hate speech detection using the HuggingFace evaluate library.
- facebook/roberta-hate-speech-dynabench-r4-target - Computes toxicity scores for hate speech using the evaluate library.
- lakshyakh93/deberta_finetuned_pii - Performs PII (personally identifiable information) detection via token classification.
HuggingFaceToxicityFeature is a specialized feature class for computing toxicity scores. It uses the HuggingFace evaluate library with a configurable model name and toxic label.
Internal helper functions include _samlowe_roberta_base_go_emotions, _openai_detector, _lmnli_fever, _toxicity, _dfp, and _map_labels, which implement the model-specific logic for each supported model.
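The registry-based dispatch described above can be sketched in plain Python. This is an illustrative stand-in only: the real `_models` registry, `ColumnType` values, and helper signatures live in `hf_feature.py`, and the names `NUMERICAL` and `_emotion_score` below are hypothetical placeholders.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for evidently's ColumnType enum value.
NUMERICAL = "num"

def _emotion_score(texts: List[str], label: str) -> List[float]:
    # Placeholder for the real text-classification pipeline call;
    # returns one score per input text.
    return [0.0 for _ in texts]

# Each registry entry maps a model id to a tuple of
# (column type, list of accepted parameters, callable).
_models: Dict[str, Tuple[str, List[str], Callable]] = {
    "SamLowe/roberta-base-go_emotions": (NUMERICAL, ["label"], _emotion_score),
}

def dispatch(model: str, texts: List[str], params: dict):
    # Reject unregistered models and unexpected parameters before
    # delegating to the model-specific helper.
    if model not in _models:
        raise ValueError(f"Model {model} is not supported")
    _column_type, allowed, func = _models[model]
    unknown = set(params) - set(allowed)
    if unknown:
        raise ValueError(f"Unexpected params: {unknown}")
    return func(texts, **params)
```

This mirrors the design choice in the module: adding support for a new model means adding one registry entry and one helper function, without touching the class's `generate_data` logic.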
Usage
Use HuggingFaceFeature when you need to generate text features using any of the supported HuggingFace models within Evidently monitoring pipelines. Use HuggingFaceToxicityFeature specifically for toxicity scoring. These are legacy features intended for use with Evidently's legacy metric and report framework.
Code Reference
Source Location
- Repository: Evidentlyai_Evidently
- File: src/evidently/legacy/features/hf_feature.py
Signature
```python
class HuggingFaceFeature(FeatureTypeFieldMixin, DataFeature):
    class Config:
        type_alias = "evidently:feature:HuggingFaceFeature"

    column_name: str
    model: str
    params: dict

    def __init__(self, *, column_name: str, model: str, params: dict, display_name: str): ...
    def generate_data(self, data: pd.DataFrame, data_definition: DataDefinition) -> pd.Series: ...


class HuggingFaceToxicityFeature(DataFeature):
    class Config:
        type_alias = "evidently:feature:HuggingFaceToxicityFeature"

    __feature_type__: ClassVar = ColumnType.Numerical

    column_name: str
    model: Optional[str]
    toxic_label: Optional[str]

    def __init__(self, *, column_name: str, display_name: str, model: Optional[str] = None, toxic_label: Optional[str] = None): ...
    def generate_data(self, data: pd.DataFrame, data_definition: DataDefinition) -> pd.Series: ...
```
Import
```python
from evidently.legacy.features.hf_feature import HuggingFaceFeature
from evidently.legacy.features.hf_feature import HuggingFaceToxicityFeature
```
I/O Contract
Inputs (HuggingFaceFeature)
| Name | Type | Required | Description |
|---|---|---|---|
| column_name | str | Yes | Name of the text column in the DataFrame to analyze |
| model | str | Yes | HuggingFace model identifier (must be a key in the _models registry) |
| params | dict | Yes | Dictionary of model-specific parameters (e.g., label, score_threshold, labels, threshold) |
| display_name | str | Yes | Human-readable name for the generated feature column |
Inputs (HuggingFaceToxicityFeature)
| Name | Type | Required | Description |
|---|---|---|---|
| column_name | str | Yes | Name of the text column in the DataFrame to analyze |
| display_name | str | Yes | Human-readable name for the generated feature column |
| model | Optional[str] | No | HuggingFace model name for toxicity evaluation (defaults to None) |
| toxic_label | Optional[str] | No | Label to use for toxicity scoring (defaults to None) |
Outputs
| Name | Type | Description |
|---|---|---|
| return | pd.Series | A pandas Series containing the computed feature values (numerical scores or categorical labels depending on the model) |
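As the table notes, some models return numerical scores while others return categorical labels derived from a score threshold (for example, the AI-generated-text detector described earlier). A minimal sketch of that thresholding step, with hypothetical function and label names not taken from the module, might look like:

```python
from typing import Tuple

def label_from_score(score: float, threshold: float = 0.5,
                     labels: Tuple[str, str] = ("generated", "human")) -> str:
    # Hypothetical illustration of score-to-label conversion:
    # scores above the threshold map to the first label,
    # everything else to the second.
    return labels[0] if score > threshold else labels[1]
```

The actual labels and default threshold used by `_openai_detector` are defined in `hf_feature.py`; this sketch only shows the general score-to-category shape of the output.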
Usage Examples
```python
# Example: Emotion detection using SamLowe/roberta-base-go_emotions
from evidently.legacy.features.hf_feature import HuggingFaceFeature

feature = HuggingFaceFeature(
    column_name="text",
    model="SamLowe/roberta-base-go_emotions",
    params={"label": "joy"},
    display_name="Joy Score",
)

# Example: Zero-shot classification
feature_zs = HuggingFaceFeature(
    column_name="text",
    model="MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli",
    params={"labels": ["positive", "negative", "neutral"], "threshold": 0.7},
    display_name="Sentiment Classification",
)

# Example: Toxicity scoring
from evidently.legacy.features.hf_feature import HuggingFaceToxicityFeature

toxicity_feature = HuggingFaceToxicityFeature(
    column_name="text",
    display_name="Toxicity Score",
    model="facebook/roberta-hate-speech-dynabench-r4-target",
    toxic_label="hate",
)
```