Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Evidentlyai Evidently Legacy RegExp Feature

From Leeroopedia
Revision as of 12:28, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Evidentlyai_Evidently_Legacy_RegExp_Feature.md)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Knowledge Sources
Domains ML Monitoring, Text Analysis, Pattern Matching
Last Updated 2026-02-14 12:00 GMT

Overview

Provides a generated feature that checks whether each value in a specified text column fully matches a given regular expression pattern.

Description

The RegExp class extends GeneratedFeature to produce a categorical feature indicating whether text values fully match a regular expression. The matching is performed using pandas' str.fullmatch method, which requires the entire string to match the pattern (not just a substring). The result is cast to an integer: 1 for a full match, 0 for no match.

The column values are first cast to strings using astype(str) before the match is applied. The feature column name follows the pattern {column_name}_{reg_exp} and the default display name is "RegExp '{reg_exp}' Match for column {column_name}".

The feature type is ColumnType.Categorical.

Usage

Use this feature when you need to validate that text values conform to a specific pattern, such as email formats, phone numbers, identifiers, or any structured text format. It is useful for data quality monitoring and output validation in ML pipelines.

Code Reference

Source Location

Signature

class RegExp(GeneratedFeature):
    class Config:
        type_alias = "evidently:feature:RegExp"

    __feature_type__: ClassVar = ColumnType.Categorical
    column_name: str
    reg_exp: str

    def __init__(self, column_name: str, reg_exp: str, display_name: Optional[str] = None): ...
    def generate_feature(self, data: pd.DataFrame, data_definition: DataDefinition) -> pd.DataFrame: ...
    def _feature_name(self): ...
    def _as_column(self) -> ColumnName: ...

Import

from evidently.legacy.features.regexp_feature import RegExp

I/O Contract

Inputs

Name Type Required Description
column_name str Yes Name of the text column in the DataFrame to match against
reg_exp str Yes Regular expression pattern for full-string matching
display_name Optional[str] No Custom display name for the feature

Outputs

Name Type Description
return pd.DataFrame A single-column DataFrame with integer values: 1 if the value fully matches the regex, 0 otherwise

Usage Examples

from evidently.legacy.features.regexp_feature import RegExp

# Check if values match an email pattern
email_feature = RegExp(
    column_name="email",
    reg_exp=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}",
    display_name="Valid Email Format"
)

# Check if values are numeric strings
numeric_feature = RegExp(
    column_name="response",
    reg_exp=r"\d+(\.\d+)?",
    display_name="Numeric Response"
)

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment