Principle: OpenAI Python Chat Response Processing
| Knowledge Sources | |
|---|---|
| Domains | NLP, Text_Generation |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
A data extraction pattern for consuming language model outputs from both complete responses and incremental streaming chunks.
Description
Response processing handles the extraction of generated content from the API's response objects. For non-streaming requests, the ChatCompletion object contains complete choices with message content, tool calls, and usage statistics. For streaming requests, a Stream iterator yields ChatCompletionChunk objects with incremental deltas that must be accumulated.
Key extraction points include: text content from choices[0].message.content, tool calls from choices[0].message.tool_calls, finish reason indicating why generation stopped, and token usage for cost tracking.
Usage
Use this principle after every Chat Completion request to extract the generated content. Choose the appropriate access pattern based on whether streaming was enabled. Always check finish_reason to determine if the response was complete or truncated.
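The finish_reason check above can be sketched as a small dispatch helper. This is a minimal illustration, not SDK code: `classify_response` is a hypothetical name, and the mock object merely mimics the shape of a ChatCompletion (`choices[0].finish_reason`).

```python
# Sketch: dispatch on finish_reason after a completed request.
# `classify_response` is a hypothetical helper; the SimpleNamespace mock
# stands in for a real ChatCompletion returned by the SDK.
from types import SimpleNamespace

def classify_response(response):
    """Return a short status string based on the first choice's finish_reason."""
    reason = response.choices[0].finish_reason
    if reason == "stop":
        return "complete"
    if reason == "length":
        return "truncated"   # hit the token limit; the answer may be cut off
    if reason == "tool_calls":
        return "tool_calls"  # model wants a tool invoked before answering
    return f"other:{reason}"

# Stand-in response object for illustration
mock = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
print(classify_response(mock))  # → truncated
```

Branching on finish_reason up front keeps truncation handling (for example, retrying with a larger token limit) separate from the normal content-extraction path.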
Theoretical Basis
Response processing follows two patterns:
Non-streaming (complete response):
```python
# Direct attribute access on the typed ChatCompletion object
content = response.choices[0].message.content
tool_calls = response.choices[0].message.tool_calls
finish_reason = response.choices[0].finish_reason  # "stop", "length", "tool_calls"
tokens_used = response.usage.total_tokens
```
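One detail worth spelling out: each tool call's arguments arrive as a JSON string that must be decoded before use. The following sketch shows that decoding step; `extract_tool_calls` is a hypothetical helper, and the mock object only imitates the message shape (`tool_calls[i].function.name` / `.arguments`).

```python
# Sketch: decode tool-call arguments from a message-like object.
# `extract_tool_calls` is a hypothetical helper name.
import json
from types import SimpleNamespace

def extract_tool_calls(message):
    """Parse tool calls into (name, args) pairs.

    function.arguments is a JSON-encoded string; malformed JSON is
    skipped here for simplicity.
    """
    calls = []
    for tc in message.tool_calls or []:
        try:
            args = json.loads(tc.function.arguments)
        except json.JSONDecodeError:
            continue
        calls.append((tc.function.name, args))
    return calls

# Stand-in for response.choices[0].message
mock_msg = SimpleNamespace(tool_calls=[
    SimpleNamespace(function=SimpleNamespace(
        name="get_weather", arguments='{"city": "Paris"}'))
])
print(extract_tool_calls(mock_msg))  # → [('get_weather', {'city': 'Paris'})]
```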
Streaming (incremental chunks):
```python
# Accumulate deltas from the chunk iterator
full_content = ""
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        full_content += delta.content
```
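A more defensive version of the loop above also captures the finish_reason and skips chunks whose choices list is empty (a trailing usage-only chunk can look like this). The function name and the mock chunk objects are illustrative stand-ins for the SDK's ChatCompletionChunk stream, not part of the library.

```python
# Sketch: fold streaming chunks into (full_text, finish_reason).
# `accumulate_stream` is a hypothetical helper; mocks imitate chunk shape.
from types import SimpleNamespace

def accumulate_stream(chunks):
    """Accumulate content deltas, guarding against empty choices lists
    and None content (e.g. the role-only first delta)."""
    parts = []
    finish_reason = None
    for chunk in chunks:
        if not chunk.choices:
            continue  # e.g. a trailing usage-only chunk
        choice = chunk.choices[0]
        if choice.delta.content:
            parts.append(choice.delta.content)
        if choice.finish_reason:
            finish_reason = choice.finish_reason
    return "".join(parts), finish_reason

# Stand-in chunks for illustration; real ones come from the Stream iterator
mock_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(
        delta=SimpleNamespace(content="Hel"), finish_reason=None)]),
    SimpleNamespace(choices=[SimpleNamespace(
        delta=SimpleNamespace(content="lo"), finish_reason="stop")]),
    SimpleNamespace(choices=[]),
]
print(accumulate_stream(mock_chunks))  # → ('Hello', 'stop')
```

Joining a list of parts at the end avoids repeated string concatenation and keeps the loop body focused on the two guards.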