Implementation: deepset-ai Haystack OpenAIGenerator
Overview
OpenAIGenerator is a Haystack component that generates text using OpenAI's large language models. It is a wrapper around the OpenAI Chat Completions API, accepting a simple string prompt as input and returning generated text replies with associated metadata. It supports the gpt-4 and gpt-5 series models and provides streaming response capabilities.
Source Location
- File: haystack/components/generators/openai.py (lines 32-270)
- Class: OpenAIGenerator
- Component decorator: @component
Import
from haystack.components.generators import OpenAIGenerator
External Dependencies
- openai (Python package): Provides the OpenAI client and the ChatCompletion, ChatCompletionChunk, and Stream types for interacting with the OpenAI API.
Constructor
def __init__(
self,
api_key: Secret = Secret.from_env_var("OPENAI_API_KEY"),
model: str = "gpt-5-mini",
streaming_callback: StreamingCallbackT | None = None,
api_base_url: str | None = None,
organization: str | None = None,
system_prompt: str | None = None,
generation_kwargs: dict[str, Any] | None = None,
timeout: float | None = None,
max_retries: int | None = None,
http_client_kwargs: dict[str, Any] | None = None,
)
Parameters
- api_key (Secret): The OpenAI API key. Defaults to reading from the OPENAI_API_KEY environment variable.
- model (str): The model name to use. Defaults to "gpt-5-mini".
- streaming_callback (StreamingCallbackT | None): Callback function invoked for each new token during streaming. Receives a StreamingChunk argument.
- api_base_url (str | None): Optional custom base URL for the API (useful for proxies or compatible endpoints).
- organization (str | None): Optional OpenAI Organization ID.
- system_prompt (str | None): Optional system prompt prepended to all requests. If not provided, the model's default system prompt is used.
- generation_kwargs (dict[str, Any] | None): Additional parameters passed directly to the OpenAI API (e.g., temperature, top_p, max_completion_tokens, stop, presence_penalty, frequency_penalty, logit_bias, n).
- timeout (float | None): Request timeout in seconds. Defaults to the OPENAI_TIMEOUT environment variable or 30 seconds.
- max_retries (int | None): Maximum retry attempts on internal errors. Defaults to the OPENAI_MAX_RETRIES environment variable or 5.
- http_client_kwargs (dict[str, Any] | None): Keyword arguments for configuring a custom httpx.Client.
Initialization Behavior
- Resolves the API key from the Secret object.
- Reads timeout and max_retries from environment variables if not explicitly provided.
- Instantiates an OpenAI client with the configured parameters and an optional custom HTTP client.
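The environment-variable fallback chain for timeout and max_retries can be sketched as follows. This is a minimal stand-alone sketch of the behavior described above; `resolve_client_defaults` is a hypothetical name, not part of the Haystack API.

```python
import os

def resolve_client_defaults(timeout=None, max_retries=None):
    # Explicit arguments win; otherwise fall back to the OPENAI_TIMEOUT /
    # OPENAI_MAX_RETRIES environment variables, then to the hard defaults
    # of 30 seconds and 5 retries.
    if timeout is None:
        timeout = float(os.environ.get("OPENAI_TIMEOUT", "30.0"))
    if max_retries is None:
        max_retries = int(os.environ.get("OPENAI_MAX_RETRIES", "5"))
    return timeout, max_retries
```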
Run Method
@component.output_types(replies=list[str], meta=list[dict[str, Any]])
def run(
self,
prompt: str,
system_prompt: str | None = None,
streaming_callback: StreamingCallbackT | None = None,
generation_kwargs: dict[str, Any] | None = None,
) -> dict: # Returns {"replies": list[str], "meta": list[dict]}
Parameters
- prompt (str): The input prompt string for text generation.
- system_prompt (str | None): Optional runtime system prompt. Overrides the initialization system prompt if provided.
- streaming_callback (StreamingCallbackT | None): Optional runtime streaming callback. Overrides the initialization callback if provided.
- generation_kwargs (dict[str, Any] | None): Optional runtime generation parameters. Merged with (and overriding) the initialization parameters.
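The merge semantics for generation parameters can be illustrated with a small sketch; `merge_generation_kwargs` is a hypothetical helper, not a Haystack function, showing only that runtime keys win on conflict.

```python
def merge_generation_kwargs(init_kwargs, runtime_kwargs):
    # Later dict wins on duplicate keys, so runtime values override
    # initialization values while non-conflicting keys are kept.
    return {**(init_kwargs or {}), **(runtime_kwargs or {})}
```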
Returns
{"replies": list[str], "meta": list[dict]}: A dictionary containing:
- replies: A list of generated text strings (one per completion).
- meta: A list of metadata dictionaries, each containing model, index, finish_reason, and usage information.
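The shape of the return value can be sketched as the final extraction step: each generated message contributes its text to replies and its metadata to meta, as two parallel lists. `split_replies_and_meta` is a hypothetical illustration, assuming messages are represented as plain dicts.

```python
def split_replies_and_meta(messages):
    # Produce the {"replies": [...], "meta": [...]} contract described above,
    # keeping replies[i] and meta[i] aligned per completion.
    return {
        "replies": [m["text"] for m in messages],
        "meta": [m["meta"] for m in messages],
    }
```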
Behavior
- Constructs a ChatMessage from the user prompt. Prepends a system message if a system prompt is available (runtime > init > none).
- Merges initialization and runtime generation_kwargs, with runtime values taking precedence.
- Selects the streaming callback (runtime > init > none).
- Converts messages to OpenAI's expected format via to_openai_dict_format().
- Calls client.chat.completions.create() with streaming enabled if a callback is present.
- For streaming: processes each chunk through the callback, then assembles the chunks into a final ChatMessage. Streaming is limited to n=1.
- For non-streaming: converts each Choice in the completion response to a ChatMessage.
- Checks finish reasons and logs warnings for truncation (length) or content filtering (content_filter).
- Extracts text and metadata from the ChatMessage objects and returns them as separate lists.
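The streaming path above can be sketched as a loop over text deltas: each delta is handed to the callback as it arrives, then all deltas are joined into the final reply. This is a minimal stand-in (`stream_to_reply` is a hypothetical name) that models chunks as plain strings rather than StreamingChunk objects.

```python
def stream_to_reply(deltas, callback=None):
    # Invoke the callback on each incoming delta, then assemble the
    # full reply text from the accumulated deltas.
    parts = []
    for delta in deltas:
        if callback is not None:
            callback(delta)
        parts.append(delta)
    return "".join(parts)
```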
Serialization
def to_dict(self) -> dict[str, Any]
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "OpenAIGenerator"
Supports full serialization and deserialization. The streaming callback is serialized by name and deserialized back to a callable.
Usage Example
from haystack.components.generators import OpenAIGenerator
client = OpenAIGenerator()
response = client.run("What's Natural Language Processing? Be brief.")
print(response)
# {'replies': ['Natural Language Processing (NLP) is a branch of artificial intelligence...'],
# 'meta': [{'model': 'gpt-5-mini', 'index': 0, 'finish_reason': 'stop',
# 'usage': {'prompt_tokens': 16, 'completion_tokens': 49, 'total_tokens': 65}}]}
API Wrapper Note
This component is a wrapper around the OpenAI Chat Completions API. It translates Haystack's string-based prompt interface into OpenAI's message-based API format internally. The prompt string is wrapped as a user ChatMessage, and any system prompt is prepended as a system ChatMessage, before being sent to the chat.completions.create endpoint.
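The translation described above can be sketched in a few lines, assuming messages in OpenAI's plain-dict format; `build_openai_messages` is a hypothetical helper, not the component's internal function.

```python
def build_openai_messages(prompt, system_prompt=None):
    # Wrap the string prompt as a user message; prepend the system
    # prompt as a system message when one is provided.
    messages = []
    if system_prompt is not None:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})
    return messages
```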