Implementation:Evidentlyai Evidently Legacy Feature Generator
| Knowledge Sources | |
|---|---|
| Domains | ML Monitoring, Feature Engineering, Pipeline Orchestration |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
Orchestrates the computation of additional generated features by running a list of GeneratedFeatures instances through the Evidently calculation engine with reference and current datasets.
Description
The FeatureGenerator class extends Runnable and serves as the top-level entry point for computing generated features outside of a full report or test suite context. It wraps an internal Suite instance to leverage the existing engine infrastructure.
The run() method performs the following steps:
- Validates that current_data is provided (raises ValueError if None).
- Resets the internal suite and sets the calculation engine (defaulting to PythonEngine if none is specified).
- Derives a DataDefinition from the current and reference data using the column mapping and configured categorical cardinality options.
- Packages the data into a GenericInputData object and converts it via the engine.
- Calls engine.calculate_additional_features() with the converted data, the list of generated features, and the suite options.
- Stores the results in the internal suite context.
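The steps above can be sketched in plain Python. This is a simplified illustration, not Evidently's implementation: `SketchEngine` and `SketchFeatureGenerator` are hypothetical stand-ins, features are modelled as a name-to-function mapping, and the data definition and input conversion steps are omitted.

```python
from typing import Any, Callable, Dict, List, Optional, Type

Rows = List[Dict[str, Any]]
FeatureFuncs = Dict[str, Callable[[Dict[str, Any]], Any]]


class SketchEngine:
    """Hypothetical stand-in for a calculation engine (not Evidently's PythonEngine)."""

    def calculate_additional_features(
        self, current: Rows, reference: Optional[Rows], features: FeatureFuncs
    ) -> Dict[str, Dict[str, Optional[List[Any]]]]:
        results: Dict[str, Dict[str, Optional[List[Any]]]] = {}
        for name, func in features.items():
            results[name] = {
                "current": [func(row) for row in current],
                # In this sketch, a missing reference dataset yields no reference result.
                "reference": None if reference is None else [func(row) for row in reference],
            }
        return results


class SketchFeatureGenerator:
    """Hypothetical, simplified mirror of the run() flow described above."""

    def __init__(self, features: FeatureFuncs):
        self.features = features
        self.context: Dict[str, Any] = {}

    def run(
        self,
        *,
        reference_data: Optional[Rows],
        current_data: Optional[Rows],
        engine: Optional[Type[SketchEngine]] = None,
    ) -> None:
        # Validate that current data is present.
        if current_data is None:
            raise ValueError("current_data must be provided")
        # Reset stored state and fall back to the default engine class.
        self.context = {}
        active_engine = (engine or SketchEngine)()
        # Compute the features, then store the results in the context.
        self.context["features"] = active_engine.calculate_additional_features(
            current_data, reference_data, self.features
        )
```

Passing the engine as a class rather than an instance mirrors the `Optional[Type[Engine]]` annotation in the signature; the generator instantiates it on each run.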
The get_features() method retrieves computed feature results. When called with a specific GeneratedFeatures instance, it returns an EngineDatasets object containing the feature data for both current and reference datasets. When called without arguments, it merges all computed features together via the engine.
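The two retrieval modes can be illustrated with a small stand-in. The function `get_features_sketch` below is hypothetical: it models each dataset as a dict of columns and assumes a column-wise merge, which may differ from how the actual engine combines feature data.

```python
from typing import Any, Dict, List, Optional, Tuple

Table = Dict[str, List[Any]]  # column name -> column values


def get_features_sketch(
    results: Dict[str, Tuple[Table, Optional[Table]]],
    feature_name: Optional[str] = None,
) -> Tuple[Table, Optional[Table]]:
    """Return (current, reference) for one feature, or for all features merged."""
    # Mode 1: a specific feature was requested, so return its stored pair as-is.
    if feature_name is not None:
        return results[feature_name]

    # Mode 2: no feature given, so combine every feature's columns.
    merged_current: Table = {}
    merged_reference: Table = {}
    has_reference = False
    for current, reference in results.values():
        merged_current.update(current)
        if reference is not None:
            merged_reference.update(reference)
            has_reference = True
    return merged_current, (merged_reference if has_reference else None)
```

Returning the current and reference data as a pair loosely corresponds to the `current` and `reference` attributes of `EngineDatasets`.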
Usage
Use FeatureGenerator when you need to compute generated features independently of a full Evidently report or test suite. This is useful for feature engineering workflows where you want to enrich datasets with computed features (e.g., text length, OOV percentage, BERTScore) before feeding them into downstream analysis or ML pipelines.
Code Reference
Source Location
- Repository: Evidentlyai_Evidently
- File: src/evidently/legacy/features/feature_generator.py
Signature
class FeatureGenerator(Runnable):
    _inner_suite: Suite

    def __init__(self, features: List[GeneratedFeatures], options: AnyOptions = None): ...

    def run(
        self,
        *,
        reference_data,
        current_data,
        column_mapping: Optional[ColumnMapping] = None,
        engine: Optional[Type[Engine]] = None,
        additional_data: Dict[str, Any] = None,
        timestamp: Optional[datetime] = None,
    ) -> None: ...

    def get_features(self, feature: Optional[GeneratedFeatures] = None) -> EngineDatasets[Any]: ...
Import
from evidently.legacy.features.feature_generator import FeatureGenerator
I/O Contract
Inputs
__init__:
| Name | Type | Required | Description |
|---|---|---|---|
| features | List[GeneratedFeatures] | Yes | A list of GeneratedFeatures instances to compute. |
| options | AnyOptions | No | Configuration options for the suite engine and data definition. Defaults to None. |
run():
| Name | Type | Required | Description |
|---|---|---|---|
| reference_data | Any (typically pd.DataFrame) | No | The reference/baseline dataset. Can be None. |
| current_data | Any (typically pd.DataFrame) | Yes | The current dataset to compute features on. Must not be None. |
| column_mapping | Optional[ColumnMapping] | No | Column mapping configuration. Defaults to a new ColumnMapping(). |
| engine | Optional[Type[Engine]] | No | The calculation engine class to use. Defaults to PythonEngine. |
| additional_data | Dict[str, Any] | No | Additional data to pass to the engine. Defaults to an empty dict. |
| timestamp | Optional[datetime] | No | Optional timestamp for the run. |
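The table lists defaults such as a new ColumnMapping() and an empty dict, while the signature declares `None` for those parameters. A common Python idiom that reconciles the two is to resolve `None` inside the function body. The sketch below assumes that idiom; `resolve_run_defaults` is a hypothetical helper, and a plain dict stands in for `ColumnMapping`.

```python
from typing import Any, Dict, Optional


def resolve_run_defaults(
    column_mapping: Optional[Dict[str, Any]] = None,
    additional_data: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Replace None arguments with fresh default objects."""
    # A new object is created on every call, so separate runs never share
    # mutable state through a default argument.
    if column_mapping is None:
        column_mapping = {}  # stand-in for ColumnMapping()
    if additional_data is None:
        additional_data = {}
    return {"column_mapping": column_mapping, "additional_data": additional_data}
```

Using `None` as the sentinel avoids the mutable-default-argument pitfall, where a `{}` written directly in the signature would be shared across calls.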
get_features():
| Name | Type | Required | Description |
|---|---|---|---|
| feature | Optional[GeneratedFeatures] | No | A specific feature to retrieve. If None, all features are merged. |
Outputs
| Name | Type | Description |
|---|---|---|
| run() return | None | The method stores results internally; retrieve them via get_features(). |
| get_features() return | EngineDatasets[Any] | An EngineDatasets object with current and reference attributes containing the computed feature data. |
Usage Examples
import pandas as pd
from evidently.legacy.features.feature_generator import FeatureGenerator
from evidently.legacy.features.exact_match_feature import ExactMatchFeature
from evidently.legacy.features.contains_link_feature import ContainsLink

# Small illustrative datasets; the column names match the features defined below
reference_df = pd.DataFrame({
    "prediction": ["cat", "dog"],
    "target": ["cat", "cat"],
    "response": ["See https://example.com", "No link here"],
})
current_df = pd.DataFrame({
    "prediction": ["dog", "bird"],
    "target": ["dog", "cat"],
    "response": ["Plain text", "Docs: https://example.org"],
})

# Define features to compute
features = [
    ExactMatchFeature(columns=["prediction", "target"]),
    ContainsLink(column_name="response"),
]

# Create the generator and run
generator = FeatureGenerator(features=features)
generator.run(
    reference_data=reference_df,
    current_data=current_df,
)

# Retrieve all computed features merged together
result = generator.get_features()
current_with_features = result.current
reference_with_features = result.reference

# Retrieve a specific feature's results
match_result = generator.get_features(features[0])
Related Pages
- Environment:Evidentlyai_Evidently_Python_Core_Environment
- Evidentlyai_Evidently_Legacy_Generated_Features - The GeneratedFeatures base class that defines the feature computation interface.
- Evidentlyai_Evidently_Legacy_Custom_Feature - Custom feature classes that can be used with FeatureGenerator.
- Evidentlyai_Evidently_Legacy_BERTScore_Feature - An example of a GeneratedFeature that can be computed via FeatureGenerator.