Workflow:Openai Openai node Chat Completion

Knowledge Sources	OpenAI Node SDK, OpenAI API Reference, Chat Completions Guide
Domains	LLMs, API_Integration, Streaming
Last Updated	2026-02-15 12:00 GMT

Overview

End-to-end process for generating text from OpenAI language models using the Chat Completions API and the newer Responses API, with support for both synchronous and streaming modes.

Description

This workflow covers the fundamental pattern for interacting with OpenAI's language models through the Node.js SDK. It demonstrates how to initialize the client, construct message arrays with role-based prompts, send requests to the Chat Completions or Responses endpoints, and process the returned text. The workflow supports two consumption modes: non-streaming (where the full response is returned at once) and streaming (where partial results arrive as Server-Sent Events). The SDK provides both a low-level streaming interface using async iterators and a higher-level ChatCompletionStream helper with event-driven callbacks for content deltas, messages, and completion.

Usage

Execute this workflow when you need to generate text from an OpenAI model in a Node.js or TypeScript application. This is the most common use case for the SDK and applies to chatbots, content generation, question answering, code generation, and any scenario where you send a prompt and receive a text response. Use streaming mode when you need to display partial results to users in real time (e.g., chat interfaces), and non-streaming mode for batch processing or whenever the caller can wait for the complete response.

Execution Steps

Step 1: Client Initialization

Instantiate the OpenAI client with authentication credentials. The client reads the OPENAI_API_KEY environment variable by default, but you can also pass it explicitly. Configure optional settings such as timeout, retry behavior, base URL, and logging level at this stage.

Key considerations:

The API key can be provided via environment variable or constructor parameter
The default request timeout is 10 minutes; set a lower value for interactive applications
Default retry count is 2 with exponential backoff for transient errors
For Azure deployments, use the AzureOpenAI subclass instead
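
The configuration choices above can be sketched as follows. The option names `apiKey`, `timeout`, and `maxRetries` are the SDK's constructor options; the helper `buildClientOptions` and the 20-second timeout are illustrative assumptions, not part of the SDK.

```typescript
// Hypothetical helper: gathers the constructor options discussed above.
// With the SDK installed, the result would be passed to the client:
//   import OpenAI from "openai";
//   const client = new OpenAI(buildClientOptions(process.env));

interface ClientOptions {
  apiKey: string;     // the SDK falls back to OPENAI_API_KEY when this is omitted
  timeout: number;    // milliseconds; the SDK default is 10 minutes
  maxRetries: number; // the SDK default is 2
}

function buildClientOptions(
  env: Record<string, string | undefined>,
  timeoutMs: number = 20_000, // assumed value for an interactive application
): ClientOptions {
  const apiKey = env["OPENAI_API_KEY"];
  if (!apiKey) {
    // Fail early with a clear message instead of at the first request.
    throw new Error("OPENAI_API_KEY is not set");
  }
  return { apiKey, timeout: timeoutMs, maxRetries: 2 };
}
```

Validating the key up front is a design choice of this sketch; the SDK performs its own check when the client is constructed.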

Step 2: Request Construction

Build the request payload specifying the model, messages array, and optional parameters. Messages follow a role-based structure with system/developer, user, and assistant roles that establish the conversation context.

Key considerations:

The model parameter selects which language model to use
Messages are ordered chronologically; system/developer messages set behavior
Optional parameters include temperature, max_tokens (superseded by max_completion_tokens in current versions of the Chat Completions API), top_p, and stop sequences
Set stream: true to receive Server-Sent Events instead of a single response
For the Responses API, use input (a string, or an array of input items) instead of messages (array)
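
A minimal sketch of the payloads described above. The types are simplified local stand-ins for the richer ones the SDK exports, the helper names are hypothetical, and the model name `gpt-4o-mini` and temperature value are assumptions chosen only for illustration.

```typescript
// Simplified local types for illustration; the SDK exports richer ones.
type Role = "system" | "developer" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  temperature?: number;
  stream?: boolean;
}

// Hypothetical helper: instructions first, then prior turns in
// chronological order, then the new user prompt.
function buildChatRequest(
  instructions: string,
  history: ChatMessage[],
  userPrompt: string,
  stream: boolean = false,
): ChatRequest {
  const messages: ChatMessage[] = [
    { role: "system", content: instructions },
    ...history,
    { role: "user", content: userPrompt },
  ];
  return {
    model: "gpt-4o-mini", // assumed model name; substitute one available to you
    messages,
    temperature: 0.2,     // assumed value
    stream,
  };
}

// Responses API counterpart: `input` takes the place of `messages`.
function buildResponsesRequest(userPrompt: string): { model: string; input: string } {
  return { model: "gpt-4o-mini", input: userPrompt };
}
```

The resulting objects are what would be passed to `client.chat.completions.create(...)` and `client.responses.create(...)` respectively.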

Step 3: API Invocation

Send the request to either the Chat Completions endpoint or the Responses endpoint. For non-streaming requests, the SDK returns a promise that resolves to the complete response object. For streaming requests, the promise resolves to an async iterable that yields chunks parsed from the server-sent events.

What happens:

The SDK constructs an HTTP POST request with JSON body
Authentication headers are added automatically
The request is sent via the configured fetch implementation
On transient failures (connection errors, 408, 409, 429, and 5xx responses), the SDK retries automatically with backoff
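
The steps above can be approximated in a few lines. This is an illustration of the request shape and retry decision, not the SDK's internal code: the helper names are hypothetical, and the backoff base (500 ms) and cap (8 s) are assumptions.

```typescript
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the POST request for the Chat Completions REST endpoint.
function buildHttpRequest(apiKey: string, payload: object): HttpRequest {
  return {
    url: "https://api.openai.com/v1/chat/completions",
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`, // bearer-token authentication
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}

// Status codes treated as transient here: rate limits and server errors,
// plus 408 and 409, which the SDK README also lists as retried by default.
function isRetryable(status: number): boolean {
  return status === 408 || status === 409 || status === 429 || status >= 500;
}

// Exponential backoff without jitter; base and cap are assumed values.
function backoffDelayMs(attempt: number, baseMs: number = 500, capMs: number = 8000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

A production retry loop would normally add random jitter to the delay so that many clients do not retry in lockstep.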

Step 4: Response Processing

Extract the generated text from the response. For Chat Completions, access completion.choices[0].message.content. For Responses, access response.output_text. In streaming mode, iterate over chunks and accumulate partial content deltas.

Non-streaming processing:

Access the first choice's message content directly
Check finish_reason to understand why generation stopped
Access _request_id for debugging and logging
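
A minimal sketch of the non-streaming extraction described above. `ChatCompletionLike` is a simplified local type covering only the fields used here, and `extractResult` is a hypothetical helper rather than an SDK function.

```typescript
// Simplified local type mirroring the fields read below.
interface ChatCompletionLike {
  _request_id?: string | null;
  choices: Array<{
    finish_reason: string | null;
    message: { content: string | null };
  }>;
}

interface ExtractedResult {
  text: string;
  truncated: boolean;       // true when generation stopped at the token limit
  requestId: string | null; // useful in logs and support requests
}

function extractResult(completion: ChatCompletionLike): ExtractedResult {
  const choice = completion.choices[0];
  if (!choice) {
    throw new Error("Completion contained no choices");
  }
  return {
    // content can be null, for example when the model returns a tool call
    text: choice.message.content ?? "",
    truncated: choice.finish_reason === "length",
    requestId: completion._request_id ?? null,
  };
}
```

For the Responses API the equivalent read is a single property, `response.output_text`.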

Streaming processing:

Iterate with for await...of to receive chunks as they arrive
Each chunk contains a delta with partial content
Use the higher-level .stream() helper for event-driven patterns with on('content') and on('message') callbacks
Call .finalChatCompletion() on a Chat Completions stream, or .finalResponse() on a Responses stream, to get the accumulated result
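
The low-level iteration pattern can be sketched as below. `ChunkLike` is a simplified local type, and `deltaText` and `collectStream` are hypothetical helpers; the sketch assumes each chunk carries at most one choice of interest.

```typescript
// Simplified local type covering the chunk fields read below.
interface ChunkLike {
  choices: Array<{
    delta: { content?: string | null };
    finish_reason?: string | null;
  }>;
}

// Returns the text carried by one chunk ("" when it has no content delta).
function deltaText(chunk: ChunkLike): string {
  const choice = chunk.choices[0];
  return choice?.delta.content ?? "";
}

// Consumes the stream, forwarding each delta and returning the full text.
async function collectStream(
  stream: AsyncIterable<ChunkLike>,
  onDelta: (text: string) => void = () => {},
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    const text = deltaText(chunk);
    if (text) {
      onDelta(text); // e.g. append to the UI as it arrives
      full += text;
    }
  }
  return full;
}
```

With the SDK, the iterable would come from `await client.chat.completions.create({ ..., stream: true })`, and `onDelta` could write each fragment to the terminal or a chat view as it arrives.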
