Implementation:Evidentlyai Evidently Legacy RegExp Feature
| Knowledge Sources | |
|---|---|
| Domains | ML Monitoring, Text Analysis, Pattern Matching |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
Provides a generated feature that checks whether each value in a specified text column fully matches a given regular expression pattern.
Description
The RegExp class extends GeneratedFeature to produce a categorical feature indicating whether text values fully match a regular expression. The matching is performed using pandas' str.fullmatch method, which requires the entire string to match the pattern (not just a substring). The result is cast to an integer: 1 for a full match, 0 for no match.
The column values are first cast to strings using astype(str) before the match is applied. The feature column name follows the pattern {column_name}_{reg_exp} and the default display name is "RegExp '{reg_exp}' Match for column {column_name}".
The feature type is ColumnType.Categorical.
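The matching behavior described above can be approximated with plain pandas. The following is a minimal sketch, not the library's source: the helper name `regexp_match_sketch` and the sample data are hypothetical, and only the cast-to-string, full-match, cast-to-integer sequence and the `{column_name}_{reg_exp}` naming are taken from the description.

```python
import pandas as pd


def regexp_match_sketch(data: pd.DataFrame, column_name: str, reg_exp: str) -> pd.DataFrame:
    """Hypothetical stand-in that mirrors the described behavior of RegExp."""
    # Cast every value to a string first, so non-string values can be matched too.
    as_text = data[column_name].astype(str)
    # fullmatch requires the whole string to match, not just a substring.
    matches = as_text.str.fullmatch(reg_exp)
    # Column name follows the documented pattern {column_name}_{reg_exp}.
    feature_name = f"{column_name}_{reg_exp}"
    # Booleans become integers: 1 for a full match, 0 otherwise.
    return pd.DataFrame({feature_name: matches.astype(int)}, index=data.index)


# Illustrative data: the last value is an integer, not a string.
df = pd.DataFrame({"code": ["AB12", "AB12x", "ab12", 1234]})
result = regexp_match_sketch(df, "code", r"[A-Z]{2}\d{2}")
print(result)
```

Only "AB12" yields 1 here: "AB12x" has a trailing character, "ab12" is lowercase, and the integer 1234 becomes the string "1234", which does not match the pattern.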
Usage
Use this feature when you need to validate that text values conform to a specific pattern, such as email formats, phone numbers, identifiers, or any structured text format. It is useful for data quality monitoring and output validation in ML pipelines.
Code Reference
Source Location
- Repository: Evidentlyai_Evidently
- File: src/evidently/legacy/features/regexp_feature.py
Signature
class RegExp(GeneratedFeature):
    class Config:
        type_alias = "evidently:feature:RegExp"

    __feature_type__: ClassVar = ColumnType.Categorical
    column_name: str
    reg_exp: str

    def __init__(self, column_name: str, reg_exp: str, display_name: Optional[str] = None): ...
    def generate_feature(self, data: pd.DataFrame, data_definition: DataDefinition) -> pd.DataFrame: ...
    def _feature_name(self): ...
    def _as_column(self) -> ColumnName: ...
Import
from evidently.legacy.features.regexp_feature import RegExp
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| column_name | str | Yes | Name of the text column in the DataFrame to match against |
| reg_exp | str | Yes | Regular expression pattern for full-string matching |
| display_name | Optional[str] | No | Custom display name for the feature |
Outputs
| Name | Type | Description |
|---|---|---|
| return | pd.DataFrame | A single-column DataFrame with integer values: 1 if the value fully matches the regex, 0 otherwise |
Usage Examples
from evidently.legacy.features.regexp_feature import RegExp

# Check if values match an email pattern
email_feature = RegExp(
    column_name="email",
    reg_exp=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}",
    display_name="Valid Email Format",
)

# Check if values are numeric strings
numeric_feature = RegExp(
    column_name="response",
    reg_exp=r"\d+(\.\d+)?",
    display_name="Numeric Response",
)
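To see what the two patterns above would accept under full-match semantics, they can be tried directly with pandas before wiring them into a feature. This is a sketch under assumptions: the sample values are made up, and the check uses `Series.str.fullmatch` on its own rather than calling `RegExp.generate_feature`.

```python
import pandas as pd

# Patterns copied from the usage examples above.
email_pattern = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
numeric_pattern = r"\d+(\.\d+)?"

# Hypothetical sample values chosen to include near misses.
emails = pd.Series(["user@example.com", "user@example", "contact: user@example.com"])
responses = pd.Series(["42", "3.14", "3.", "about 42"])

# Same sequence the feature is described as using: str cast, full match, int cast.
email_flags = emails.astype(str).str.fullmatch(email_pattern).astype(int).tolist()
numeric_flags = responses.astype(str).str.fullmatch(numeric_pattern).astype(int).tolist()

# For contrast: a substring search accepts values that a full match rejects.
# A non-capturing group is used here so pandas does not warn about match groups.
numeric_partial = responses.str.contains(r"\d+(?:\.\d+)?").astype(int).tolist()

print(email_flags)
print(numeric_flags)
print(numeric_partial)
```

Because the whole string must match, a valid address embedded in a longer sentence and a number with surrounding text are both flagged 0, whereas the substring search flags every response as 1.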