Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Evidentlyai Evidently Legacy OpenAI Feature

From Leeroopedia
Knowledge Sources
Domains ML Monitoring, LLM Evaluation, OpenAI Integration
Last Updated 2026-02-14 12:00 GMT

Overview

Provides a generated feature that uses OpenAI models to evaluate text data via customizable prompts, supporting both legacy completion and chat completion APIs with configurable response post-processing.

Description

The OpenAIFeature class extends both FeatureTypeFieldMixin and GeneratedFeature to produce features by sending text data to OpenAI models with custom prompts. It supports two API modes based on the model:

  • Legacy completions API (for models: gpt-3.5-turbo-instruct, babbage-002, davinci-002): Uses client.completions.create with a single formatted prompt string.
  • Chat completions API (all other models): Uses client.chat.completions.create with a system message (context) and a user message (formatted prompt).

Key features of the class:

  • Prompt templating: The prompt string contains a prompt_replace_string (default: "REPLACE") that is substituted with the actual text value, and a context_replace_string (default: "CONTEXT") that is substituted with optional context.
  • Context support: Context can be provided as a static string (context) or from a DataFrame column (context_column). These are mutually exclusive.
  • Response post-processing: The _postprocess_response function processes the LLM's response based on check_mode and possible_values. It supports modes like "any_line" (checks each line) and "contains" (checks if a possible value is contained in the line). If possible_values is set, the response is matched against these values.
  • Feature type handling: For categorical features, the post-processed string is returned directly. For numerical features, the post-processed response is cast to a float, with None returned on failure.
  • Unique feature IDs: Each instance generates a unique feature_id via new_id() to ensure column name uniqueness.

Usage

Use this feature when you need to evaluate or classify text data using OpenAI models within Evidently monitoring pipelines. This is a legacy feature class; for newer implementations, consider using LLMJudge which provides a more structured template-based approach.

Code Reference

Source Location

Signature

class OpenAIFeature(FeatureTypeFieldMixin, GeneratedFeature):
    class Config:
        type_alias = "evidently:feature:OpenAIFeature"

    column_name: str
    feature_id: str
    prompt: str
    prompt_replace_string: str
    context: Optional[str]
    context_column: Optional[str]
    context_replace_string: str
    openai_params: dict
    model: str
    check_mode: str
    possible_values: Optional[List[str]]

    def __init__(
        self,
        column_name: str,
        model: str,
        prompt: str,
        feature_type: str,
        context: Optional[str] = None,
        context_column: Optional[str] = None,
        prompt_replace_string: str = "REPLACE",
        context_replace_string: str = "CONTEXT",
        check_mode: str = "any_line",
        possible_values: Optional[List[str]] = None,
        openai_params: Optional[dict] = None,
        display_name: Optional[str] = None,
    ): ...
    def generate_feature(self, data: pd.DataFrame, data_definition: DataDefinition) -> pd.DataFrame: ...
    def _as_column(self) -> ColumnName: ...
    def _feature_column_name(self) -> str: ...

Import

from evidently.legacy.features.openai_feature import OpenAIFeature

I/O Contract

Inputs

Name Type Required Description
column_name str Yes Name of the text column in the DataFrame to evaluate
model str Yes OpenAI model name (e.g., "gpt-4", "gpt-3.5-turbo")
prompt str Yes Prompt template string with a placeholder for the text value
feature_type str Yes Output feature type: "cat" for categorical, anything else for numerical
context Optional[str] No Static context string (mutually exclusive with context_column)
context_column Optional[str] No Name of a DataFrame column to use as per-row context (mutually exclusive with context)
prompt_replace_string str No Placeholder in the prompt to replace with the text value (default: "REPLACE")
context_replace_string str No Placeholder in the prompt to replace with context (default: "CONTEXT")
check_mode str No Response parsing mode: "any_line", "any_line_contains", "first_line", "first_line_contains" (default: "any_line")
possible_values Optional[List[str]] No List of valid response values to match against (case-insensitive)
openai_params Optional[dict] No Additional parameters to pass to the OpenAI API call
display_name Optional[str] No Custom display name for the feature

Outputs

Name Type Description
return pd.DataFrame A single-column DataFrame with string values (categorical) or float values (numerical), or None for unparseable responses

Usage Examples

from evidently.legacy.features.openai_feature import OpenAIFeature

# Categorical classification with possible values
sentiment_feature = OpenAIFeature(
    column_name="review",
    model="gpt-4",
    prompt="Classify the sentiment of the following text as positive, negative, or neutral: REPLACE",
    feature_type="cat",
    possible_values=["positive", "negative", "neutral"],
    context="You are a sentiment analysis expert.",
    display_name="Sentiment"
)

# Numerical scoring
quality_feature = OpenAIFeature(
    column_name="response",
    model="gpt-4",
    prompt="Rate the quality of the following response on a scale of 1-10: REPLACE",
    feature_type="num",
    display_name="Quality Score"
)

# With context column
relevance_feature = OpenAIFeature(
    column_name="answer",
    model="gpt-4",
    prompt="Given the context CONTEXT, is the following answer relevant? REPLACE",
    feature_type="cat",
    context_column="question",
    possible_values=["yes", "no"],
)

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment