Implementation:Evidentlyai Evidently Custom Descriptors
| Knowledge Sources | |
|---|---|
| Domains | Descriptors, Data Processing, Extensibility |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
Descriptor classes that allow users to apply custom Python functions to individual columns or entire datasets for computing derived features.
Description
This module provides two custom descriptor classes that enable user-defined transformations:
CustomColumnDescriptor applies a custom callable to a single named column. The constructor accepts either a callable or a string holding the fully qualified function name. When a callable is provided, it is cached internally and the string representation is derived from the function's __module__ and __name__ attributes (module.func_name). The generate_data method retrieves the column from the dataset and applies the stored callable, returning a DatasetColumn.
CustomDescriptor applies a custom callable to the entire Dataset object, enabling transformations that may depend on multiple columns or complex logic. Like CustomColumnDescriptor, it accepts either a string or a callable, with the same caching and naming logic. The generate_data method passes the full dataset to the callable.
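The naming logic can be sketched in a few lines. This is an illustration only, assuming the stored string is the function's `__module__` and `__name__` joined by a dot; `func_path` is a hypothetical helper name, not part of Evidently.

```python
import json
from typing import Callable


def func_path(func: Callable) -> str:
    """Build a 'module.name' string for a callable (assumed naming scheme)."""
    return f"{func.__module__}.{func.__name__}"


# A named, module-level function yields an importable path.
named = func_path(json.dumps)

# A lambda has the name '<lambda>', so its path cannot be imported back later.
anonymous = func_path(lambda column: column)

print(named)
print(anonymous)
```

Because a lambda's `__name__` is always `<lambda>`, a named module-level function is the safer choice whenever the descriptor may be serialized and the string has to identify the function afterwards.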
Both classes:
- Store the function reference as a string (func field) for serialization, with the actual callable held in a Pydantic PrivateAttr
- Generate default aliases from the function path (e.g., custom_column_descriptor:module.func_name)
- Support optional tests via the tests parameter
- Raise ValueError if invoked without a configured callable (e.g., after deserialization without restoring the callable)
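The shared behaviour listed above can be sketched with a plain Python class. This is a simplified stand-in, not Evidently's implementation: `FuncHolder`, `to_dict`, `apply`, and `resolve` are hypothetical names, and a regular attribute takes the place of the Pydantic PrivateAttr. Whether Evidently restores the callable from the string on its own is not stated here, so the `resolve` helper is purely an assumption about how one might do it by hand.

```python
import importlib
from typing import Any, Callable, Optional, Union


class FuncHolder:
    """Stand-in for the pattern: serializable path plus a non-serialized callable."""

    def __init__(self, func: Union[str, Callable]):
        if callable(func):
            self._func: Optional[Callable] = func  # live callable, kept out of to_dict()
            self.func = f"{func.__module__}.{func.__name__}"  # serializable path
        else:
            self._func = None  # only the path is known, e.g. after deserialization
            self.func = func

    def to_dict(self) -> dict:
        """Serialize only the string path, never the callable itself."""
        return {"func": self.func}

    def apply(self, value: Any) -> Any:
        if self._func is None:
            raise ValueError(f"{self.func} is not configured with a callable")
        return self._func(value)


def resolve(path: str) -> Callable:
    """Hypothetical helper: import 'package.module.name' back into a callable."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


live = FuncHolder(len)
restored = FuncHolder(live.to_dict()["func"])  # simulates a deserialized instance

try:
    restored.apply("abc")
except ValueError as err:
    print("not configured:", err)

# Re-attach the callable on the stand-in, then the call succeeds.
restored._func = resolve(restored.func)
print(restored.apply("abc"))
```

The sketch shows why a descriptor built from a string alone cannot run: only the path survives serialization, and the callable must be supplied again before generate_data can be used.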
Usage
Use CustomColumnDescriptor when your transformation logic operates on a single column (e.g., text preprocessing, feature extraction from one field). Use CustomDescriptor when the transformation requires access to multiple columns or the full dataset context.
Code Reference
Source Location
- Repository: Evidentlyai_Evidently
- File: src/evidently/descriptors/_custom_descriptors.py
Signature
CustomColumnCallable = Callable[[DatasetColumn], DatasetColumn]

class CustomColumnDescriptor(Descriptor):
    column_name: str
    func: str
    _func: Optional[CustomColumnCallable] = PrivateAttr(None)

    def __init__(
        self, column_name: str, func: Union[str, CustomColumnCallable],
        alias: Optional[str] = None, tests: Optional[List[AnyDescriptorTest]] = None,
    ): ...
    def generate_data(self, dataset: Dataset, options: Options) -> Union[DatasetColumn, Dict[str, DatasetColumn]]: ...
    def list_input_columns(self) -> Optional[List[str]]: ...

CustomDescriptorCallable = Callable[[Dataset], Union[DatasetColumn, Dict[str, DatasetColumn]]]

class CustomDescriptor(Descriptor):
    func: str
    _func: Optional[CustomDescriptorCallable] = PrivateAttr(None)

    def __init__(
        self, func: Union[str, CustomDescriptorCallable],
        alias: Optional[str] = None, tests: Optional[List[AnyDescriptorTest]] = None,
    ): ...
    def generate_data(self, dataset: "Dataset", options: Options) -> Union[DatasetColumn, Dict[str, DatasetColumn]]: ...
Import
from evidently.descriptors._custom_descriptors import CustomColumnDescriptor, CustomDescriptor
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| column_name | str | Yes (CustomColumnDescriptor) | Name of the column to apply the function to |
| func | Union[str, Callable] | Yes | Custom function or fully qualified function name string |
| alias | Optional[str] | No | Custom display name for the descriptor; auto-generated from func if not provided |
| tests | Optional[List[AnyDescriptorTest]] | No | Tests to apply to the computed descriptor values |
Outputs
| Name | Type | Description |
|---|---|---|
| CustomColumnDescriptor.generate_data return | Union[DatasetColumn, Dict[str, DatasetColumn]] | Transformed column data from the custom function |
| CustomDescriptor.generate_data return | Union[DatasetColumn, Dict[str, DatasetColumn]] | Transformed data from the custom function applied to the full dataset |
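The two return shapes in the table can be illustrated without the library installed. The sketch below uses hypothetical stand-ins: `Column` mimics only the type and data fields of DatasetColumn, and a list of dicts takes the place of a Dataset. It shows the shape of the return values, not how Evidently names or stores the resulting columns.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Column:
    """Hypothetical stand-in for DatasetColumn: a column type plus its values."""
    type: str
    data: List[float]


Rows = List[Dict[str, str]]  # hypothetical stand-in for a Dataset


def length_ratio(rows: Rows) -> Column:
    """Single-column result: response length divided by question length."""
    return Column("numerical", [len(r["response"]) / len(r["question"]) for r in rows])


def text_lengths(rows: Rows) -> Dict[str, Column]:
    """Multi-column result: one named column per derived feature."""
    return {
        "question_length": Column("numerical", [float(len(r["question"])) for r in rows]),
        "response_length": Column("numerical", [float(len(r["response"])) for r in rows]),
    }


rows = [{"question": "Hi?", "response": "Hello!"}]
single = length_ratio(rows)
multi = text_lengths(rows)

print(single)
print(sorted(multi))
```

Returning a dict is useful when one pass over the dataset yields several related features, since the data is read once instead of once per descriptor.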
Usage Examples
from evidently.descriptors._custom_descriptors import CustomColumnDescriptor, CustomDescriptor
from evidently.core.datasets import DatasetColumn

# Custom column descriptor backed by a named function
def uppercase_transform(column: DatasetColumn) -> DatasetColumn:
    # Keep the column type and upper-case every string value
    return DatasetColumn(
        type=column.type,
        data=column.data.str.upper(),
    )

col_descriptor = CustomColumnDescriptor(
    column_name="text",
    func=uppercase_transform,
    alias="uppercase_text",
)

# Custom descriptor operating on the full dataset
def length_ratio(dataset):
    # Ratio of character lengths (not word counts) between two text columns
    df = dataset.as_dataframe()
    ratio = df["response"].str.len() / df["question"].str.len()
    return DatasetColumn(type="numerical", data=ratio)

full_descriptor = CustomDescriptor(
    func=length_ratio,
    alias="response_question_length_ratio",
)

# Add descriptors to a Dataset (`dataset` is assumed to have been created elsewhere)
dataset.add_descriptors([col_descriptor, full_descriptor])
Related Pages
- Environment:Evidentlyai_Evidently_Python_Core_Environment
- Evidentlyai_Evidently_Context_Relevance - Built-in descriptor for context relevance scoring
- Evidentlyai_Evidently_Text_Length_Descriptor - Built-in descriptor for text length