Implementation:Evidentlyai Evidently Legacy Feature Generator

From Leeroopedia
Knowledge Sources
Domains: ML Monitoring, Feature Engineering, Pipeline Orchestration
Last Updated: 2026-02-14 12:00 GMT

Overview

Orchestrates the computation of additional generated features by running a list of GeneratedFeatures instances through the Evidently calculation engine with reference and current datasets.

Description

The FeatureGenerator class extends Runnable and serves as the top-level entry point for computing generated features outside of a full report or test suite context. It wraps an internal Suite instance to leverage the existing engine infrastructure.

The run() method performs the following steps:

  1. Validates that current_data is provided (raises ValueError if None).
  2. Resets the internal suite and sets the calculation engine (defaulting to PythonEngine if none is specified).
  3. Derives a DataDefinition from the current and reference data using the column mapping and configured categorical cardinality options.
  4. Packages the data into a GenericInputData object and converts it via the engine.
  5. Calls engine.calculate_additional_features() with the converted data, the list of generated features, and the suite options.
  6. Stores the results in the internal suite context.
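
The steps above can be read as a small orchestration routine. The following is a minimal sketch in plain Python, not Evidently code: SketchFeature and SketchGenerator are hypothetical stand-ins, lists of dicts replace DataFrames, and the data-definition and engine-conversion steps (3 to 5) are collapsed into a direct loop over the features.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

Rows = List[Dict[str, Any]]  # stand-in for a DataFrame: one dict per row


@dataclass
class SketchFeature:
    """Hypothetical stand-in for a GeneratedFeatures instance."""
    name: str
    compute: Callable[[Dict[str, Any]], Any]


@dataclass
class SketchGenerator:
    """Hypothetical stand-in that mirrors the order of the run() steps."""
    features: List[SketchFeature]
    results: Dict[str, Dict[str, Optional[list]]] = field(default_factory=dict)

    def run(self, *, reference_data: Optional[Rows], current_data: Optional[Rows]) -> None:
        # Step 1: current data is mandatory (the message text is illustrative).
        if current_data is None:
            raise ValueError("current_data must be provided")
        # Step 2: drop any state left over from a previous run.
        self.results = {}
        # Steps 3-5 (simplified): compute every feature on both datasets.
        for feature in self.features:
            self.results[feature.name] = {
                "current": [feature.compute(row) for row in current_data],
                "reference": None
                if reference_data is None
                else [feature.compute(row) for row in reference_data],
            }
        # Step 6: results stay on the instance until they are retrieved.


# A text-length feature computed on current data only.
gen = SketchGenerator([SketchFeature("text_length", lambda row: len(row["response"]))])
gen.run(reference_data=None, current_data=[{"response": "hello"}, {"response": "hi"}])
print(gen.results["text_length"]["current"])  # [5, 2]
```

The sketch keeps run() free of return values, matching the documented contract that results are stored internally and read back separately.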

The get_features() method retrieves computed feature results. When called with a specific GeneratedFeatures instance, it returns an EngineDatasets object containing the feature data for both current and reference datasets. When called without arguments, it merges all computed features together via the engine.
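
The two retrieval modes can be illustrated with a small stand-in. This is a sketch under stated assumptions, not the library's implementation: SketchDatasets and get_features_sketch are hypothetical names, features are keyed by plain strings rather than GeneratedFeatures instances, and the merge is a simple column-wise dict update, whereas the real class delegates merging to the engine.

```python
from typing import Dict, NamedTuple, Optional

Columns = Dict[str, list]  # stand-in for a DataFrame: column name -> values


class SketchDatasets(NamedTuple):
    """Hypothetical stand-in for EngineDatasets: a current/reference pair."""
    current: Columns
    reference: Optional[Columns]


def get_features_sketch(
    computed: Dict[str, SketchDatasets], feature: Optional[str] = None
) -> SketchDatasets:
    """Return one feature's datasets, or all features merged column-wise."""
    if feature is not None:
        return computed[feature]
    current: Columns = {}
    reference: Columns = {}
    has_reference = False
    for datasets in computed.values():
        current.update(datasets.current)
        if datasets.reference is not None:
            reference.update(datasets.reference)
            has_reference = True
    return SketchDatasets(current, reference if has_reference else None)


computed = {
    "exact_match": SketchDatasets({"exact_match": [True, False]}, {"exact_match": [True]}),
    "contains_link": SketchDatasets({"contains_link": [False, True]}, {"contains_link": [False]}),
}
merged = get_features_sketch(computed)                 # no argument: everything merged
single = get_features_sketch(computed, "exact_match")  # one feature only
print(sorted(merged.current))  # ['contains_link', 'exact_match']
```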

Usage

Use FeatureGenerator when you need to compute generated features independently of a full Evidently report or test suite. This is useful for feature engineering workflows where you want to enrich datasets with computed features (e.g., text length, OOV percentage, BERTScore) before feeding them into downstream analysis or ML pipelines.

Code Reference

Source Location

Signature

class FeatureGenerator(Runnable):
    _inner_suite: Suite

    def __init__(self, features: List[GeneratedFeatures], options: AnyOptions = None): ...

    def run(
        self,
        *,
        reference_data,
        current_data,
        column_mapping: Optional[ColumnMapping] = None,
        engine: Optional[Type[Engine]] = None,
        additional_data: Dict[str, Any] = None,
        timestamp: Optional[datetime] = None,
    ) -> None: ...

    def get_features(self, feature: Optional[GeneratedFeatures] = None) -> EngineDatasets[Any]: ...

Import

from evidently.legacy.features.feature_generator import FeatureGenerator

I/O Contract

Inputs

__init__:

Name | Type | Required | Description
features | List[GeneratedFeatures] | Yes | A list of GeneratedFeatures instances to compute.
options | AnyOptions | No | Configuration options for the suite engine and data definition. Defaults to None.

run():

Name | Type | Required | Description
reference_data | Any (typically pd.DataFrame) | No | The reference/baseline dataset. Can be None.
current_data | Any (typically pd.DataFrame) | Yes | The current dataset to compute features on. Must not be None; otherwise run() raises ValueError.
column_mapping | Optional[ColumnMapping] | No | Column mapping configuration. Defaults to None, in which case a new ColumnMapping() is used.
engine | Optional[Type[Engine]] | No | The calculation engine class to use. Defaults to None, in which case PythonEngine is used.
additional_data | Dict[str, Any] | No | Additional data to pass to the engine. Defaults to None, treated as an empty dict.
timestamp | Optional[datetime] | No | Optional timestamp for the run.

get_features():

Name | Type | Required | Description
feature | Optional[GeneratedFeatures] | No | A specific feature to retrieve. If None, all features are merged.

Outputs

Name | Type | Description
run() return | None | The method stores results internally; retrieve them via get_features().
get_features() return | EngineDatasets[Any] | An EngineDatasets object with current and reference attributes containing the computed feature data.

Usage Examples

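import pandas as pd
from evidently.legacy.features.feature_generator import FeatureGenerator
from evidently.legacy.features.exact_match_feature import ExactMatchFeature
from evidently.legacy.features.contains_link_feature import ContainsLink

# Illustrative data (not from the original page); the column names
# match the ones the features below refer to
reference_df = pd.DataFrame({
    "prediction": ["yes", "no"],
    "target": ["yes", "yes"],
    "response": ["See https://example.com", "No link here"],
})
current_df = pd.DataFrame({
    "prediction": ["no", "yes"],
    "target": ["no", "no"],
    "response": ["Plain text", "Docs: https://example.org"],
})

# Define features to compute
features = [
    ExactMatchFeature(columns=["prediction", "target"]),
    ContainsLink(column_name="response"),
]

# Create the generator and run; run() returns None and stores results internally
generator = FeatureGenerator(features=features)
generator.run(
    reference_data=reference_df,
    current_data=current_df,
)

# Retrieve all computed features merged together
result = generator.get_features()
current_with_features = result.current
reference_with_features = result.reference

# Retrieve a specific feature's results (also an EngineDatasets object)
match_result = generator.get_features(features[0])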
