Implementation:Evidentlyai Evidently Legacy OpenAI Feature

Knowledge Sources	Evidentlyai_Evidently
Domains	ML Monitoring, LLM Evaluation, OpenAI Integration
Last Updated	2026-02-14 12:00 GMT

Overview

Provides a generated feature that uses OpenAI models to evaluate text data via customizable prompts, supporting both legacy completion and chat completion APIs with configurable response post-processing.

Description

The OpenAIFeature class extends both FeatureTypeFieldMixin and GeneratedFeature to produce features by sending text data to OpenAI models with custom prompts. It supports two API modes based on the model:

Legacy completions API (for models: gpt-3.5-turbo-instruct, babbage-002, davinci-002): Uses client.completions.create with a single formatted prompt string.
Chat completions API (all other models): Uses client.chat.completions.create with a system message (context) and a user message (formatted prompt).

Key features of the class:

Prompt templating: The prompt string contains a prompt_replace_string (default: "REPLACE") that is substituted with the actual text value, and a context_replace_string (default: "CONTEXT") that is substituted with optional context.
Context support: Context can be provided as a static string (context) or from a DataFrame column (context_column). These are mutually exclusive.
Response post-processing: The _postprocess_response function processes the LLM's response based on check_mode and possible_values. It supports modes like "any_line" (checks each line) and "contains" (checks if a possible value is contained in the line). If possible_values is set, the response is matched against these values.
Feature type handling: For categorical features, the post-processed string is returned directly. For numerical features, the post-processed response is cast to a float, with None returned on failure.
Unique feature IDs: Each instance generates a unique feature_id via new_id() to ensure column name uniqueness.

Usage

Use this feature when you need to evaluate or classify text data using OpenAI models within Evidently monitoring pipelines. This is a legacy feature class; for newer implementations, consider using LLMJudge which provides a more structured template-based approach.

Code Reference

Source Location

Repository: Evidentlyai_Evidently
File: src/evidently/legacy/features/openai_feature.py

Signature

class OpenAIFeature(FeatureTypeFieldMixin, GeneratedFeature):
    class Config:
        type_alias = "evidently:feature:OpenAIFeature"

    column_name: str
    feature_id: str
    prompt: str
    prompt_replace_string: str
    context: Optional[str]
    context_column: Optional[str]
    context_replace_string: str
    openai_params: dict
    model: str
    check_mode: str
    possible_values: Optional[List[str]]

    def __init__(
        self,
        column_name: str,
        model: str,
        prompt: str,
        feature_type: str,
        context: Optional[str] = None,
        context_column: Optional[str] = None,
        prompt_replace_string: str = "REPLACE",
        context_replace_string: str = "CONTEXT",
        check_mode: str = "any_line",
        possible_values: Optional[List[str]] = None,
        openai_params: Optional[dict] = None,
        display_name: Optional[str] = None,
    ): ...
    def generate_feature(self, data: pd.DataFrame, data_definition: DataDefinition) -> pd.DataFrame: ...
    def _as_column(self) -> ColumnName: ...
    def _feature_column_name(self) -> str: ...

Import

from evidently.legacy.features.openai_feature import OpenAIFeature

I/O Contract

Inputs

Name	Type	Required	Description
column_name	str	Yes	Name of the text column in the DataFrame to evaluate
model	str	Yes	OpenAI model name (e.g., "gpt-4", "gpt-3.5-turbo")
prompt	str	Yes	Prompt template string with a placeholder for the text value
feature_type	str	Yes	Output feature type: "cat" for categorical, anything else for numerical
context	Optional[str]	No	Static context string (mutually exclusive with context_column)
context_column	Optional[str]	No	Name of a DataFrame column to use as per-row context (mutually exclusive with context)
prompt_replace_string	str	No	Placeholder in the prompt to replace with the text value (default: "REPLACE")
context_replace_string	str	No	Placeholder in the prompt to replace with context (default: "CONTEXT")
check_mode	str	No	Response parsing mode: "any_line", "any_line_contains", "first_line", "first_line_contains" (default: "any_line")
possible_values	Optional[List[str]]	No	List of valid response values to match against (case-insensitive)
openai_params	Optional[dict]	No	Additional parameters to pass to the OpenAI API call
display_name	Optional[str]	No	Custom display name for the feature

Outputs

Name	Type	Description
return	pd.DataFrame	A single-column DataFrame with string values (categorical) or float values (numerical), or None for unparseable responses

Usage Examples

from evidently.legacy.features.openai_feature import OpenAIFeature

# Categorical classification with possible values
sentiment_feature = OpenAIFeature(
    column_name="review",
    model="gpt-4",
    prompt="Classify the sentiment of the following text as positive, negative, or neutral: REPLACE",
    feature_type="cat",
    possible_values=["positive", "negative", "neutral"],
    context="You are a sentiment analysis expert.",
    display_name="Sentiment"
)

# Numerical scoring
quality_feature = OpenAIFeature(
    column_name="response",
    model="gpt-4",
    prompt="Rate the quality of the following response on a scale of 1-10: REPLACE",
    feature_type="num",
    display_name="Quality Score"
)

# With context column
relevance_feature = OpenAIFeature(
    column_name="answer",
    model="gpt-4",
    prompt="Given the context CONTEXT, is the following answer relevant? REPLACE",
    feature_type="cat",
    context_column="question",
    possible_values=["yes", "no"],
)

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment