Implementation:Evidentlyai Evidently Generated Descriptors
| Knowledge Sources | |
|---|---|
| Domains | NLP, Text Analysis, LLM Evaluation, Data Quality |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
Provides factory functions that create FeatureDescriptor instances wrapping legacy V1 feature implementations, offering a unified API for text analysis, validation, matching, and LLM-based evaluation descriptors.
Description
The generated_descriptors module is a collection of convenience factory functions that bridge the legacy V1 feature system and the current V2 descriptor API. Each function instantiates the corresponding legacy feature class and wraps it in a FeatureDescriptor, so existing V1 feature implementations remain usable from within the V2 framework.
The module contains the following categories of descriptors:
Text Analysis Descriptors:
- Sentiment() -- Compute sentiment scores for text columns.
- WordCount() -- Count words in text.
- SentenceCount() -- Count sentences in text.
- NonLetterCharacterPercentage() -- Percentage of non-letter characters.
- OOVWordsPercentage() -- Out-of-vocabulary word percentage.
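To make the per-row outputs of these descriptors concrete, the following standard-library sketch computes a word count and a non-letter character percentage for a single string. The helper names `word_count` and `non_letter_percentage`, the tokenization rule, and the treatment of whitespace are assumptions made for illustration; the legacy Evidently features may tokenize and normalize differently.

```python
import re


def word_count(text: str) -> int:
    # Assumption: a "word" is a maximal run of word characters.
    return len(re.findall(r"\w+", text))


def non_letter_percentage(text: str) -> float:
    # Assumption: every non-alphabetic character (including whitespace)
    # counts as a non-letter, and empty text yields 0.0.
    if not text:
        return 0.0
    non_letters = sum(1 for ch in text if not ch.isalpha())
    return 100.0 * non_letters / len(text)
```

In a dataset, a descriptor of this kind would produce one such value per row of the chosen text column.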
Text Matching Descriptors:
- Contains() / DoesNotContain() -- Check if text contains/excludes specified items.
- BeginsWith() / EndsWith() -- Check prefix/suffix matches.
- IncludesWords() / ExcludesWords() -- Word-level inclusion/exclusion with optional lemmatization.
- WordsPresence() -- Flexible word presence check with multiple modes.
- ItemMatch() / ItemNoMatch() -- Column-to-column item matching.
- WordMatch() / WordNoMatch() -- Column-to-column word matching with lemmatization.
- TriggerWordsPresent() -- Check for trigger word presence.
- RegExp() -- Regular expression pattern matching.
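The interaction of `case_sensitive` and `mode` in a check such as Contains() can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function name `contains` is hypothetical, matching is done on substrings, and only the "any" and "all" modes are modeled.

```python
from typing import List


def contains(text: str, items: List[str],
             case_sensitive: bool = True, mode: str = "any") -> bool:
    # Normalize case on both sides when the check is case-insensitive.
    if not case_sensitive:
        text = text.lower()
        items = [item.lower() for item in items]
    # Assumption: substring matching, not word-level matching.
    hits = (item in text for item in items)
    if mode == "any":
        return any(hits)
    if mode == "all":
        return all(hits)
    raise ValueError(f"unsupported mode: {mode!r}")
```

A DoesNotContain()-style check would then be the logical negation of the same test.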
Text Similarity Descriptors:
- BERTScore() -- BERT-based text similarity scoring.
- SemanticSimilarity() -- Sentence-transformer-based semantic similarity.
- ExactMatch() -- Exact equality between columns.
Validation Descriptors:
- IsValidJSON() -- Validate JSON format.
- IsValidPython() -- Validate Python syntax.
- IsValidSQL() -- Validate SQL syntax.
- JSONMatch() -- Compare JSON values across columns.
- JSONSchemaMatch() -- Validate JSON against an expected schema.
- ContainsLink() -- Detect URLs/links in text.
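As a rough picture of what format validation means for a single value, the sketch below checks JSON and Python syntax using only the standard library. The helper names `is_valid_json` and `is_valid_python` are hypothetical, and the choice of `json.loads` and `ast.parse` is an assumption for illustration; the legacy features behind IsValidJSON() and IsValidPython() may apply different rules.

```python
import ast
import json


def is_valid_json(text: str) -> bool:
    # json.JSONDecodeError is a subclass of ValueError.
    try:
        json.loads(text)
    except (TypeError, ValueError):
        return False
    return True


def is_valid_python(text: str) -> bool:
    # ast.parse only checks syntax; it does not execute the code.
    try:
        ast.parse(text)
    except (SyntaxError, ValueError):
        return False
    return True
```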
HuggingFace Model Descriptors:
- HuggingFace() -- Apply arbitrary HuggingFace models.
- HuggingFaceToxicity() -- Toxicity detection via HuggingFace.
OpenAI / LLM Descriptors:
- OpenAI() -- Evaluate text using OpenAI models with custom prompts.
- LLMJudge() -- LLM evaluation with a custom prompt template.
- LLMEval() -- General-purpose LLM evaluation.
LLM-as-Judge Evaluation Descriptors:
- BiasLLMEval() -- Evaluate text for bias.
- ToxicityLLMEval() -- Evaluate toxicity.
- NegativityLLMEval() -- Evaluate negativity.
- PIILLMEval() -- Detect personally identifiable information.
- DeclineLLMEval() -- Detect LLM refusals/declines.
- FaithfulnessLLMEval() -- Evaluate faithfulness to context.
- CompletenessLLMEval() -- Evaluate completeness against context.
- CorrectnessLLMEval() -- Evaluate correctness against target output.
- ContextQualityLLMEval() -- Evaluate context quality for a question.
- BinaryClassificationLLMEval() -- Binary classification via LLM.
- MulticlassClassificationLLMEval() -- Multiclass classification via LLM.
Each function follows the same pattern: instantiate the legacy V1 feature class, then wrap it in a FeatureDescriptor with optional alias and tests.
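The two-step pattern described above can be sketched with stand-in classes. Everything in this block is hypothetical: `LegacyWordCount` and `FeatureDescriptorSketch` are simplified placeholders rather than Evidently classes, and the real FeatureDescriptor does considerably more than store its arguments.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class LegacyWordCount:
    """Hypothetical stand-in for a legacy V1 feature class."""
    column_name: str


@dataclass
class FeatureDescriptorSketch:
    """Hypothetical stand-in for FeatureDescriptor; it only stores arguments."""
    feature: Any
    alias: Optional[str] = None
    tests: List[Any] = field(default_factory=list)


def word_count_factory(column_name: str,
                       alias: Optional[str] = None,
                       tests: Optional[List[Any]] = None) -> FeatureDescriptorSketch:
    # Step 1: instantiate the legacy feature with its own arguments.
    feature = LegacyWordCount(column_name=column_name)
    # Step 2: wrap it, passing through the optional alias and tests.
    return FeatureDescriptorSketch(feature=feature, alias=alias,
                                   tests=list(tests or []))
```

Keeping `alias` and `tests` on the wrapper rather than on the feature means each factory only needs to forward feature-specific arguments to the legacy class.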
Usage
Use these factory functions when:
- Adding text analysis, validation, or LLM evaluation descriptors to a dataset or report.
- Working with the V2 descriptor API while leveraging legacy feature implementations.
- Building pipelines that need text matching, sentiment analysis, or LLM-based evaluation.
Code Reference
Source Location
- Repository: Evidentlyai_Evidently
- File: src/evidently/descriptors/generated_descriptors.py
Signature
# Text analysis
def Sentiment(column_name: str, alias: Optional[str] = None, tests: Optional[List] = None) -> FeatureDescriptor
def WordCount(column_name: str, alias: Optional[str] = None, tests: Optional[List] = None) -> FeatureDescriptor
def SentenceCount(column_name: str, alias: Optional[str] = None, tests: Optional[List] = None) -> FeatureDescriptor
# Text matching
def Contains(column_name: str, items: List[str], case_sensitive: bool = True, mode: str = "any", ...) -> FeatureDescriptor
def DoesNotContain(column_name: str, items: List[str], ...) -> FeatureDescriptor
def BeginsWith(column_name: str, prefix: str, ...) -> FeatureDescriptor
def EndsWith(column_name: str, suffix: str, ...) -> FeatureDescriptor
def RegExp(column_name: str, reg_exp: str, ...) -> FeatureDescriptor
# Validation
def IsValidJSON(column_name: str, ...) -> FeatureDescriptor
def IsValidPython(column_name: str, ...) -> FeatureDescriptor
def IsValidSQL(column_name: str, ...) -> FeatureDescriptor
def JSONSchemaMatch(column_name: str, expected_schema: Dict[str, Type], ...) -> FeatureDescriptor
# Similarity
def BERTScore(columns: List[str], model: str = "bert-base-uncased", ...) -> FeatureDescriptor
def SemanticSimilarity(columns: List[str], model: str = "all-MiniLM-L6-v2", ...) -> FeatureDescriptor
# LLM evaluation
def LLMJudge(provider: str, model: str, template: BaseLLMPromptTemplate, ...) -> FeatureDescriptor
def LLMEval(column_name: str, provider: str, model: str, template: BaseLLMPromptTemplate, ...) -> FeatureDescriptor
def BiasLLMEval(column_name: str, provider: str = "openai", model: str = "gpt-4o-mini", ...) -> FeatureDescriptor
def ToxicityLLMEval(column_name: str, ...) -> FeatureDescriptor
Import
from evidently.descriptors.generated_descriptors import (
    Sentiment,
    WordCount,
    SentenceCount,
    Contains,
    DoesNotContain,
    BeginsWith,
    EndsWith,
    RegExp,
    IsValidJSON,
    IsValidPython,
    IsValidSQL,
    JSONSchemaMatch,
    BERTScore,
    SemanticSimilarity,
    LLMJudge,
    LLMEval,
    BiasLLMEval,
    ToxicityLLMEval,
    FaithfulnessLLMEval,
    CorrectnessLLMEval,
    HuggingFace,
    HuggingFaceToxicity,
    OpenAI,
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| column_name | str | Yes (most functions) | Name of the text column to process |
| columns | List[str] | Yes (multi-column functions) | List of column names to compare (e.g., BERTScore, SemanticSimilarity, ExactMatch) |
| items / words_list | List[str] | Varies | List of items or words to match against |
| model | str | Varies | Model name (HuggingFace, OpenAI, BERT, sentence-transformer) |
| provider | str | Yes (LLM functions) | LLM provider name (e.g., "openai", "anthropic") |
| template | BaseLLMPromptTemplate | Yes (LLMJudge, LLMEval) | Prompt template defining the evaluation task |
| alias | Optional[str] | No | Custom display name for the descriptor |
| tests | Optional[List[Union[DescriptorTest, GenericTest]]] | No | Tests to apply to the descriptor output |
| case_sensitive | bool | No | Whether matching is case-sensitive (default: True) |
| mode | str | No | Matching mode such as "any", "all", "includes_any", "includes_all" |
| lemmatize | bool | No | Whether to lemmatize words before matching (default varies) |
Outputs
| Name | Type | Description |
|---|---|---|
| return | FeatureDescriptor | A descriptor wrapping the legacy V1 feature, ready for use in Dataset.add_columns() or Report configuration |
Usage Examples
Basic Sentiment Analysis
from evidently.descriptors.generated_descriptors import Sentiment
descriptor = Sentiment(column_name="response_text", alias="Response Sentiment")
Text Contains Check with Tests
from evidently.descriptors.generated_descriptors import Contains
from evidently.tests import gt
descriptor = Contains(
    column_name="answer",
    items=["yes", "no", "maybe"],
    case_sensitive=False,
    mode="any",
    alias="Contains expected answer",
    tests=[gt(0.8)],
)
LLM-based Toxicity Evaluation
from evidently.descriptors.generated_descriptors import ToxicityLLMEval
descriptor = ToxicityLLMEval(
    column_name="user_message",
    provider="openai",
    model="gpt-4o-mini",
    include_reasoning=True,
    alias="Toxicity Check",
)
JSON Schema Validation
from evidently.descriptors.generated_descriptors import JSONSchemaMatch
descriptor = JSONSchemaMatch(
    column_name="api_response",
    expected_schema={"name": str, "age": int, "email": str},
    validate_types=True,
    exact_match=False,
)
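One plausible reading of the `validate_types` and `exact_match` flags used above is sketched below for a single JSON string. The semantics here are assumptions, not confirmed library behavior: `exact_match=False` is taken to mean that every expected key must be present while extra keys are tolerated, `exact_match=True` that the key sets must be identical, and `validate_types=True` that each value must be an instance of the expected Python type. The function name `json_schema_match` is hypothetical.

```python
import json
from typing import Dict


def json_schema_match(text: str, expected_schema: Dict[str, type],
                      validate_types: bool = False,
                      exact_match: bool = False) -> bool:
    try:
        data = json.loads(text)
    except (TypeError, ValueError):
        return False
    # Assumption: only JSON objects can match a key/type schema.
    if not isinstance(data, dict):
        return False
    # Assumption: exact matching forbids both missing and extra keys.
    if exact_match and set(data) != set(expected_schema):
        return False
    for key, expected_type in expected_schema.items():
        if key not in data:
            return False
        if validate_types and not isinstance(data[key], expected_type):
            return False
    return True
```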
BERTScore Similarity
from evidently.descriptors.generated_descriptors import BERTScore
descriptor = BERTScore(
    columns=["reference_text", "generated_text"],
    model="bert-base-uncased",
    alias="BERT Similarity",
)
Related Pages
- Environment:Evidentlyai_Evidently_Python_Core_Environment
- Implementation:Evidentlyai_Evidently_Text_Match_Descriptor -- The V2 native TextMatch descriptor that replaces several legacy text matching features
- Implementation:Evidentlyai_Evidently_LLM_Templates -- LLM prompt templates used by LLMJudge, LLMEval, and other LLM-based descriptors
- Implementation:Evidentlyai_Evidently_Metric_Types -- The metric type system that descriptors integrate with through FeatureDescriptor