Implementation:Openai Openai python HTTP Response Handler

Knowledge Sources	Openai_Openai_python
Domains	SDK_Infrastructure, Python
Last Updated	2026-02-15 00:00 GMT

Overview

Concrete tool for modern response handling, streaming support, and binary response wrappers provided by the openai-python SDK.

Description

The _response.py module (849 lines) implements the current (non-legacy) response handling system. It replaces LegacyAPIResponse with properly separated sync and async classes. Key components:

BaseAPIResponse[R] (line 48): Abstract generic base class that wraps an httpx.Response. Provides:
- Properties: headers, http_request, status_code, url, method, http_version, elapsed, is_closed
- Stores retries_taken count and the target _cast_to type
- Internal _parse() method (line 128) handles type dispatch: SSE streams, NoneType, primitives (str, bytes, int, float, bool), HttpxBinaryResponseContent, raw httpx.Response, and Pydantic BaseModel JSON deserialization
- Caches parsed results per target type in _parsed_by_type
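
The dispatch-and-cache behaviour described above can be illustrated with a minimal, standard-library-only sketch. SketchResponse, its parse_calls counter, and the JSON fallback are hypothetical stand-ins for illustration; this is not the SDK's _parse() implementation, which also handles SSE streams, binary content, and Pydantic models.

```python
import json
from typing import Any, Dict, Optional, Type


class SketchResponse:
    """Toy stand-in for a response wrapper; not the SDK class."""

    def __init__(self, body: bytes, cast_to: Type[Any]) -> None:
        self._body = body
        self._cast_to = cast_to
        # One cached result per requested target type.
        self._parsed_by_type: Dict[Type[Any], Any] = {}
        self.parse_calls = 0  # hypothetical counter, only used by the demo

    def parse(self, to: Optional[Type[Any]] = None) -> Any:
        cache_key = to if to is not None else self._cast_to
        if cache_key in self._parsed_by_type:
            return self._parsed_by_type[cache_key]
        self.parse_calls += 1
        parsed = self._dispatch(cache_key)
        self._parsed_by_type[cache_key] = parsed
        return parsed

    def _dispatch(self, target: Type[Any]) -> Any:
        # Dispatch on the requested type, simplest cases first.
        if target is type(None):
            return None
        if target is bytes:
            return self._body
        text = self._body.decode("utf-8")
        if target is str:
            return text
        if target is bool:
            return text.lower() == "true"
        if target in (int, float):
            return target(text)
        # Assumed fallback for this sketch: treat the body as JSON.
        return json.loads(text)


resp = SketchResponse(b'{"id": "abc"}', cast_to=dict)
first = resp.parse()          # parsed from the body
second = resp.parse()         # served from the per-type cache
as_text = resp.parse(to=str)  # different target type, parsed separately
```

Keying the cache by target type means a caller can request the same body as several types without paying for the same conversion twice.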

APIResponse[R] (line 274): Synchronous response class. Key methods:
- parse() (line 285): Reads the response body (via read()) then delegates to _parse(); applies post_parser and attaches request_id to BaseModel results
- read() (line 333): Returns the raw body bytes; raises StreamAlreadyConsumed if the underlying stream has already been consumed (for example, by iterating it with iter_lines() first)
- text() (line 343), json() (line 348): Decoded response content
- close() (line 353): Closes the response and releases the connection
- iter_bytes() (line 360), iter_text() (line 369), iter_lines() (line 377): Streaming iteration methods
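
The interaction between read() and the iteration methods can be sketched as follows. OneShotBody is a hypothetical class written for this illustration, and the exception is a local stand-in that only shares its name with the SDK's; the sketch assumes the body is a one-shot stream that read() buffers on first use.

```python
from typing import Iterable, Iterator, Optional


class StreamAlreadyConsumed(Exception):
    """Local stand-in for the SDK exception of the same name."""


class OneShotBody:
    """Hypothetical body wrapper: streamable once, or buffered by read()."""

    def __init__(self, chunks: Iterable[bytes]) -> None:
        self._chunks = iter(chunks)
        self._consumed = False
        self._content: Optional[bytes] = None

    def iter_bytes(self) -> Iterator[bytes]:
        # The check runs when iteration starts, not when the method is called.
        if self._consumed:
            raise StreamAlreadyConsumed("body was already streamed")
        self._consumed = True
        yield from self._chunks

    def read(self) -> bytes:
        # Buffer on first call; later calls return the cached bytes.
        if self._content is None:
            self._content = b"".join(self.iter_bytes())
        return self._content


buffered = OneShotBody([b"he", b"llo"])
first_read = buffered.read()
second_read = buffered.read()  # cached, no error

streamed = OneShotBody([b"a", b"b"])
chunks = list(streamed.iter_bytes())
try:
    streamed.read()  # nothing left to buffer
    raised = False
except StreamAlreadyConsumed:
    raised = True
```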

AsyncAPIResponse[R] (line 383): Async counterpart of APIResponse. All methods are async, using aread(), aiter_bytes(), etc. under the hood.

Binary response subclasses:
- BinaryAPIResponse (line 490): Extends APIResponse[bytes]; adds write_to_file(file) for writing content to disk
- AsyncBinaryAPIResponse (line 515): Async counterpart using anyio.Path
- StreamedBinaryAPIResponse (line 541): For streamed binary data; provides stream_to_file(file, chunk_size)
- AsyncStreamedBinaryAPIResponse (line 557): Async streamed binary variant
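
The difference between buffering a body and streaming it to disk can be shown with a short sketch. The helper names iter_chunks and stream_chunks_to_file are hypothetical and use only the standard library; they illustrate the chunked-write idea behind stream_to_file, not the SDK's code.

```python
import os
import tempfile
from typing import Iterable, Iterator


def iter_chunks(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield successive slices of data, standing in for a network stream."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def stream_chunks_to_file(chunks: Iterable[bytes], path: str) -> int:
    """Write chunks as they arrive so the full body is never held in memory."""
    written = 0
    with open(path, mode="wb") as f:
        for chunk in chunks:
            written += f.write(chunk)
    return written


payload = b"x" * 10_000
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "out.bin")
    written = stream_chunks_to_file(iter_chunks(payload, 4096), target)
    with open(target, "rb") as f:
        round_trip = f.read()
```

A buffered write (the write_to_file style) is simpler when the payload is small; the streamed form keeps memory use bounded by the chunk size.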

Context managers:
- ResponseContextManager[_APIResponseT] (line 615): Sync context manager that defers the request until __enter__ and calls close() on __exit__
- AsyncResponseContextManager[_AsyncAPIResponseT] (line 639): Async counterpart
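
The deferred-request pattern described above can be sketched in a few lines. DeferredResponseManager and FakeResponse are hypothetical names for this illustration; the sketch assumes only that the manager receives a zero-argument callable that performs the request.

```python
from typing import Callable, Generic, List, Optional, TypeVar


class FakeResponse:
    """Hypothetical response object exposing only close()."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


T = TypeVar("T", bound=FakeResponse)


class DeferredResponseManager(Generic[T]):
    """Runs the request on __enter__ and closes the response on __exit__."""

    def __init__(self, request_func: Callable[[], T]) -> None:
        self._request_func = request_func
        self._response: Optional[T] = None

    def __enter__(self) -> T:
        self._response = self._request_func()
        return self._response

    def __exit__(self, exc_type, exc, tb) -> None:
        if self._response is not None:
            self._response.close()


calls: List[str] = []


def make_request() -> FakeResponse:
    calls.append("sent")
    return FakeResponse()


manager = DeferredResponseManager(make_request)
calls_before_enter = list(calls)  # nothing sent yet

with manager as response:
    calls_inside = list(calls)
    closed_inside = response.closed

closed_after = response.closed
```

Deferring the call means that simply constructing the manager has no network side effects, and the response is closed even if the body of the with block raises.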

Higher-order wrapper functions (lines 663-830): Transform bound API methods to support raw and streamed responses:
- to_streamed_response_wrapper, async_to_streamed_response_wrapper: Wrap for streaming with context manager
- to_raw_response_wrapper, async_to_raw_response_wrapper: Wrap for raw response access
- to_custom_streamed_response_wrapper, to_custom_raw_response_wrapper: Support custom response class overrides via OVERRIDE_CAST_TO_HEADER
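
As a rough illustration of the higher-order wrapper idea, the sketch below decorates a function so that each call injects a marker header and returns a deferred callable instead of running immediately. The header name X-Sketch-Raw-Response, the create function, and to_streamed_wrapper are all hypothetical; the SDK's own header constants and return types differ.

```python
import functools
from typing import Any, Callable, Dict, Optional

# Hypothetical header name, used only in this sketch.
RAW_HEADER = "X-Sketch-Raw-Response"


def to_streamed_wrapper(func: Callable[..., Any]) -> Callable[..., Callable[[], Any]]:
    """Wrap func so each call adds a marker header and defers execution."""

    @functools.wraps(func)
    def wrapped(*args: Any, **kwargs: Any) -> Callable[[], Any]:
        # Copy the caller's headers so their dict is never mutated.
        extra_headers: Dict[str, str] = {**(kwargs.get("extra_headers") or {})}
        extra_headers[RAW_HEADER] = "stream"
        kwargs["extra_headers"] = extra_headers
        # Nothing runs until the returned callable is invoked.
        return functools.partial(func, *args, **kwargs)

    return wrapped


def create(model: str, *, extra_headers: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
    """Hypothetical API method that echoes what it was called with."""
    return {"model": model, "headers": dict(extra_headers or {})}


streamed_create = to_streamed_wrapper(create)
caller_headers = {"X-Trace": "1"}
pending = streamed_create("some-model", extra_headers=caller_headers)
result = pending()
```

A deferred callable like pending is the kind of object a context manager can invoke on entry, which is how the wrapper and the context manager fit together.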

extract_response_type(typ) (line 833): Given APIResponse[T] or a concrete subclass such as BinaryAPIResponse, returns the type argument T.
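
One way to recover a generic type argument at runtime is with typing.get_origin and typing.get_args, as in the sketch below. The classes BaseResp, Resp, and BinaryResp and the function extract_type_arg are hypothetical stand-ins; the SDK's helper may be implemented differently.

```python
from typing import Any, Generic, TypeVar, get_args, get_origin

R = TypeVar("R")


class BaseResp(Generic[R]):
    """Hypothetical generic base, standing in for a response base class."""


class Resp(BaseResp[R]):
    """Hypothetical generic subclass."""


class BinaryResp(Resp[bytes]):
    """Hypothetical concrete subclass that fixes the type argument."""


def extract_type_arg(typ: Any) -> Any:
    # Case 1: a parameterized alias such as Resp[int].
    if get_origin(typ) is not None:
        return get_args(typ)[0]
    # Case 2: a concrete subclass; inspect the generic bases along the MRO.
    for klass in typ.__mro__:
        for base in getattr(klass, "__orig_bases__", ()):
            origin = get_origin(base)
            if isinstance(origin, type) and issubclass(origin, BaseResp):
                arg = get_args(base)[0]
                if not isinstance(arg, TypeVar):
                    return arg
    raise TypeError(f"no concrete type argument found for {typ!r}")


from_alias = extract_type_arg(Resp[int])
from_subclass = extract_type_arg(BinaryResp)
```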

Usage

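These classes are returned by API method calls made through .with_streaming_response on the client, wrapped in a ResponseContextManager. APIResponse is also used internally by the client infrastructure for all non-legacy response handling. Note that in current v1 releases of the SDK, .with_raw_response returns a LegacyAPIResponse by default rather than an APIResponse; the two expose a similar surface (status_code, headers, request_id, parse()).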

Code Reference

Source Location

Repository: openai-python
File: src/openai/_response.py
Lines: 1-849

Signature

class BaseAPIResponse(Generic[R]):
    http_response: httpx.Response
    retries_taken: int

    def __init__(
        self,
        *,
        raw: httpx.Response,
        cast_to: type[R],
        client: BaseClient[Any, Any],
        stream: bool,
        stream_cls: type[Stream[Any]] | type[AsyncStream[Any]] | None,
        options: FinalRequestOptions,
        retries_taken: int = 0,
    ) -> None: ...

class APIResponse(BaseAPIResponse[R]):
    def parse(self, *, to: type[_T] | None = None) -> R | _T: ...
    def read(self) -> bytes: ...
    def text(self) -> str: ...
    def json(self) -> object: ...
    def close(self) -> None: ...
    def iter_bytes(self, chunk_size: int | None = None) -> Iterator[bytes]: ...
    def iter_text(self, chunk_size: int | None = None) -> Iterator[str]: ...
    def iter_lines(self) -> Iterator[str]: ...

class AsyncAPIResponse(BaseAPIResponse[R]):
    async def parse(self, *, to: type[_T] | None = None) -> R | _T: ...
    async def read(self) -> bytes: ...
    async def text(self) -> str: ...
    async def json(self) -> object: ...
    async def close(self) -> None: ...
    async def iter_bytes(self, chunk_size: int | None = None) -> AsyncIterator[bytes]: ...

class BinaryAPIResponse(APIResponse[bytes]):
    def write_to_file(self, file: str | os.PathLike[str]) -> None: ...

class AsyncBinaryAPIResponse(AsyncAPIResponse[bytes]):
    async def write_to_file(self, file: str | os.PathLike[str]) -> None: ...

def extract_response_type(typ: type[BaseAPIResponse[Any]]) -> type: ...

Import

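# The two main classes are re-exported from the package root
from openai import APIResponse, AsyncAPIResponse

# The binary and streamed-binary variants are imported from the private module
from openai._response import (
    BinaryAPIResponse,
    AsyncBinaryAPIResponse,
    StreamedBinaryAPIResponse,
    AsyncStreamedBinaryAPIResponse,
)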

I/O Contract

Inputs

Name	Type	Required	Description
raw	httpx.Response	Yes	The raw httpx response to wrap
cast_to	type[R]	Yes	Target type for parsing the response body
client	BaseClient[Any, Any]	Yes	Client instance that originated the request
stream	bool	Yes	Whether this is an SSE streaming response
stream_cls	type[Stream] or type[AsyncStream] or None	Yes	Stream class for SSE decoding
options	FinalRequestOptions	Yes	Request options including post_parser
retries_taken	int	No	Number of retries performed; defaults to 0

Outputs

Name	Type	Description
parse() return	R or _T	Typed parsed response data
read() return	bytes	Raw response body bytes
text() return	str	Decoded response body string
json() return	object	Parsed JSON response content
request_id	str or None	Value of the x-request-id response header
status_code	int	HTTP status code
headers	httpx.Headers	Response headers

Usage Examples

Raw Response Access

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.with_raw_response.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

print(response.status_code)        # 200
print(response.request_id)         # "req_abc123..."
print(response.headers["x-ratelimit-remaining-tokens"])

completion = response.parse()       # Parse into ChatCompletion
print(completion.choices[0].message.content)

Streaming Response

from openai import OpenAI

client = OpenAI()
with client.chat.completions.with_streaming_response.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
) as response:
    print(response.status_code)
    for line in response.iter_lines():
        print(line)

Binary File Download

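from openai import OpenAI

client = OpenAI()

# Stream the file body to disk in chunks instead of buffering it in memory.
# In current v1 releases, with_streaming_response on this endpoint yields a
# StreamedBinaryAPIResponse; a plain client.files.content(...) call returns a
# legacy binary content object instead.
with client.files.with_streaming_response.content("file-abc123") as response:
    response.stream_to_file("downloaded_file.jsonl")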
