Implementation: deepset-ai Haystack OpenAIGenerator
Overview
OpenAIGenerator is a Haystack component that generates text using OpenAI's large language models. It is a wrapper around the OpenAI Chat Completions API, accepting a simple string prompt as input and returning generated text replies with associated metadata. It supports the gpt-4 and gpt-5 series models and provides streaming response capabilities.
Source Location
- File: haystack/components/generators/openai.py (lines 32-270)
- Class: OpenAIGenerator
- Component decorator: @component
Import
from haystack.components.generators import OpenAIGenerator
External Dependencies
- openai (Python package): Provides the OpenAI client and the ChatCompletion, ChatCompletionChunk, and Stream types for interacting with the OpenAI API.
Constructor
def __init__(
self,
api_key: Secret = Secret.from_env_var("OPENAI_API_KEY"),
model: str = "gpt-5-mini",
streaming_callback: StreamingCallbackT | None = None,
api_base_url: str | None = None,
organization: str | None = None,
system_prompt: str | None = None,
generation_kwargs: dict[str, Any] | None = None,
timeout: float | None = None,
max_retries: int | None = None,
http_client_kwargs: dict[str, Any] | None = None,
)
Parameters
- api_key (Secret): The OpenAI API key. Defaults to reading from the OPENAI_API_KEY environment variable.
- model (str): The model name to use. Defaults to "gpt-5-mini".
- streaming_callback (StreamingCallbackT | None): Callback function invoked for each new token during streaming. Receives a StreamingChunk argument.
- api_base_url (str | None): Optional custom base URL for the API (useful for proxies or compatible endpoints).
- organization (str | None): Optional OpenAI Organization ID.
- system_prompt (str | None): Optional system prompt prepended to all requests. If not provided, the model's default system prompt is used.
- generation_kwargs (dict[str, Any] | None): Additional parameters passed directly to the OpenAI API (e.g., temperature, top_p, max_completion_tokens, stop, presence_penalty, frequency_penalty, logit_bias, n).
- timeout (float | None): Request timeout in seconds. Defaults to the OPENAI_TIMEOUT environment variable or 30 seconds.
- max_retries (int | None): Maximum retry attempts on internal errors. Defaults to the OPENAI_MAX_RETRIES environment variable or 5.
- http_client_kwargs (dict[str, Any] | None): Keyword arguments for configuring a custom httpx.Client.
Initialization Behavior
- Resolves the API key from the Secret object.
- Reads timeout and max_retries from environment variables if not explicitly provided.
- Instantiates an OpenAI client with the configured parameters and an optional custom HTTP client.
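The environment-variable fallback chain for timeout and max_retries can be sketched as follows. This is a minimal stand-alone sketch of the behavior described above; `resolve_client_defaults` is a hypothetical name, not part of the Haystack API.

```python
import os

def resolve_client_defaults(timeout=None, max_retries=None):
    # Explicit arguments win; otherwise fall back to the OPENAI_TIMEOUT /
    # OPENAI_MAX_RETRIES environment variables, then to the hard defaults
    # of 30 seconds and 5 retries.
    if timeout is None:
        timeout = float(os.environ.get("OPENAI_TIMEOUT", "30.0"))
    if max_retries is None:
        max_retries = int(os.environ.get("OPENAI_MAX_RETRIES", "5"))
    return timeout, max_retries
```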
Run Method
@component.output_types(replies=list[str], meta=list[dict[str, Any]])
def run(
self,
prompt: str,
system_prompt: str | None = None,
streaming_callback: StreamingCallbackT | None = None,
generation_kwargs: dict[str, Any] | None = None,
) -> dict: # Returns {"replies": list[str], "meta": list[dict]}
Parameters
- prompt (str): The input prompt string for text generation.
- system_prompt (str | None): Optional runtime system prompt. Overrides the initialization system prompt if provided.
- streaming_callback (StreamingCallbackT | None): Optional runtime streaming callback. Overrides the initialization callback if provided.
- generation_kwargs (dict[str, Any] | None): Optional runtime generation parameters. Merged with (and overriding) the initialization parameters.
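The merge semantics for generation parameters can be illustrated with a small sketch; `merge_generation_kwargs` is a hypothetical helper, not a Haystack function, showing only that runtime keys win on conflict.

```python
def merge_generation_kwargs(init_kwargs, runtime_kwargs):
    # Later dict wins on duplicate keys, so runtime values override
    # initialization values while non-conflicting keys are kept.
    return {**(init_kwargs or {}), **(runtime_kwargs or {})}
```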
Returns
{"replies": list[str], "meta": list[dict]}: A dictionary containing:
- replies: A list of generated text strings (one per completion).
- meta: A list of metadata dictionaries, each containing model, index, finish_reason, and usage information.
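The shape of the return value can be sketched as the final extraction step: each generated message contributes its text to replies and its metadata to meta, as two parallel lists. `split_replies_and_meta` is a hypothetical illustration, assuming messages are represented as plain dicts.

```python
def split_replies_and_meta(messages):
    # Produce the {"replies": [...], "meta": [...]} contract described above,
    # keeping replies[i] and meta[i] aligned per completion.
    return {
        "replies": [m["text"] for m in messages],
        "meta": [m["meta"] for m in messages],
    }
```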
Behavior
- Constructs a ChatMessage from the user prompt. Prepends a system message if a system prompt is available (runtime > init > none).
- Merges initialization and runtime generation_kwargs, with runtime values taking precedence.
- Selects the streaming callback (runtime > init > none).
- Converts messages to OpenAI's expected format via to_openai_dict_format().
- Calls client.chat.completions.create() with streaming enabled if a callback is present.
- For streaming: processes each chunk through the callback, then assembles the chunks into a final ChatMessage. Streaming is limited to n=1.
- For non-streaming: converts each Choice in the completion response to a ChatMessage.
- Checks finish reasons and logs warnings for truncation (length) or content filtering (content_filter).
- Extracts text and metadata from the ChatMessage objects and returns them as separate lists.
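The streaming path above can be sketched as a loop over text deltas: each delta is handed to the callback as it arrives, then all deltas are joined into the final reply. This is a minimal stand-in (`stream_to_reply` is a hypothetical name) that models chunks as plain strings rather than StreamingChunk objects.

```python
def stream_to_reply(deltas, callback=None):
    # Invoke the callback on each incoming delta, then assemble the
    # full reply text from the accumulated deltas.
    parts = []
    for delta in deltas:
        if callback is not None:
            callback(delta)
        parts.append(delta)
    return "".join(parts)
```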
Serialization
def to_dict(self) -> dict[str, Any]
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "OpenAIGenerator"
Supports full serialization and deserialization. The streaming callback is serialized by name and deserialized back to a callable.
Usage Example
from haystack.components.generators import OpenAIGenerator
client = OpenAIGenerator()
response = client.run("What's Natural Language Processing? Be brief.")
print(response)
# {'replies': ['Natural Language Processing (NLP) is a branch of artificial intelligence...'],
# 'meta': [{'model': 'gpt-5-mini', 'index': 0, 'finish_reason': 'stop',
# 'usage': {'prompt_tokens': 16, 'completion_tokens': 49, 'total_tokens': 65}}]}
API Wrapper Note
This component is a wrapper around the OpenAI Chat Completions API. It translates Haystack's string-based prompt interface into OpenAI's message-based API format internally. The prompt string is wrapped as a user ChatMessage, and any system prompt is prepended as a system ChatMessage, before being sent to the chat.completions.create endpoint.
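The translation described above can be sketched in a few lines, assuming messages in OpenAI's plain-dict format; `build_openai_messages` is a hypothetical helper, not the component's internal function.

```python
def build_openai_messages(prompt, system_prompt=None):
    # Wrap the string prompt as a user message; prepend the system
    # prompt as a system message when one is provided.
    messages = []
    if system_prompt is not None:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})
    return messages
```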