Principle: OpenAI Python Chat Completion Request
| Knowledge Sources | |
|---|---|
| Domains | NLP, Text_Generation |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
An API invocation pattern that sends a conversation context to a language model and receives generated text, structured data, or tool calls in response.
Description
The Chat Completion request is the core interaction pattern for generating text with OpenAI models. It takes a list of messages (conversation history), a model identifier, and optional parameters controlling generation behavior (temperature, max tokens, tools, structured output format). The API supports three output modes: standard text completion, streaming (Server-Sent Events for real-time token delivery), and structured output (model output constrained to a JSON schema defined by a Pydantic model).
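The request shape described above can be sketched as a plain JSON body, independent of any SDK. The helper below is illustrative (the function name `build_chat_request` and its defaults are assumptions, not part of the official API), but the field names (`model`, `messages`, `temperature`, `max_tokens`, `stream`, `tools`, `response_format`) mirror the Chat Completions request parameters:

```python
def build_chat_request(messages, model, temperature=1.0, max_tokens=None,
                       stream=False, tools=None, response_format=None):
    """Assemble an illustrative JSON body for a Chat Completions request.

    `messages` is a list of {"role": ..., "content": ...} dicts forming the
    conversation history; optional parameters are included only when set.
    """
    body = {"model": model, "messages": messages, "temperature": temperature}
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if stream:
        body["stream"] = True          # request Server-Sent Events delivery
    if tools:
        body["tools"] = tools          # enable tool calling
    if response_format is not None:
        body["response_format"] = response_format  # constrain output schema
    return body

# A minimal standard-completion body:
body = build_chat_request(
    [{"role": "user", "content": "Summarize this paragraph."}],
    model="gpt-4o",
)
```

In practice an SDK client serializes an equivalent body for you; the sketch just makes the wire-level parameters concrete.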
Tool calling enables the model to request function executions by returning structured tool call objects instead of plain text. Structured outputs guarantee the response conforms to a provided schema.
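A tool-calling round trip can be sketched without the network step: declare a function schema, and when the model returns a tool call, look up the named function and invoke it with the JSON-decoded arguments. Everything here is a hedged sketch — `get_weather` is a hypothetical local function, and the tool call is shown as a plain dict mirroring the response shape (real SDK responses expose the same fields as attributes on typed objects):

```python
import json

def get_weather(city):
    # Hypothetical local function the model can request via tool calling.
    return {"city": city, "temp_c": 21}

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def dispatch_tool_call(tool_call, registry):
    """Execute one tool call: resolve the function by name, decode the
    JSON-encoded argument string, and invoke it."""
    fn = registry[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])
    return fn(**args)
```

The function result is then appended to the conversation as a `tool` role message so the model can produce its final answer.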
Usage
Use this principle whenever you need to generate text from a language model given conversation context. Choose streaming mode for real-time UIs, structured output mode for data extraction, and tool calling for agentic workflows that integrate external functions.
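For the streaming mode mentioned above, the consumer's job is to concatenate the per-token `delta` fragments into the full message. The sketch below uses plain dicts shaped like streaming chunks (real SDK chunks carry the same fields on typed objects); the helper name `accumulate_stream` is an assumption for illustration:

```python
def accumulate_stream(chunks):
    """Concatenate content deltas from a sequence of streaming chunks.

    Each chunk carries choices[0].delta, which may hold a "content"
    fragment, a "role", tool-call fragments, or be empty on the final chunk.
    """
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        if "content" in delta and delta["content"] is not None:
            parts.append(delta["content"])
    return "".join(parts)
```

A real-time UI would render each fragment as it arrives instead of buffering the whole string.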
Theoretical Basis
The request follows a Request-Response pattern with three variants:
```
# Standard completion
response = send_request(messages, model, params) -> ChatCompletion

# Streaming completion
stream = send_request(messages, model, stream=True) -> Iterator[ChatCompletionChunk]

# Structured completion (schema-constrained)
parsed = send_request(messages, model, response_format=Schema) -> ParsedChatCompletion[Schema]
```
The model applies autoregressive generation conditioned on the message context, with sampling controlled by temperature and top_p parameters. When tools are provided, the model may choose to emit a tool_calls array instead of text content.
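The effect of `temperature` and `top_p` on sampling can be made concrete with a small sketch: scale the logits by the temperature, convert to a softmax distribution, keep the smallest set of tokens whose cumulative probability reaches `top_p` (nucleus truncation), and sample from what remains. This is an illustrative model of the parameters' semantics, not the provider's actual implementation:

```python
import math
import random

def sample_token(logits, temperature=1.0, top_p=1.0):
    """Sample one token index: temperature-scaled softmax, then nucleus
    (top_p) truncation, then draw from the renormalized kept set."""
    scaled = [l / max(temperature, 1e-6) for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]   # stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most probable tokens until cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break

    # Sample proportionally within the kept set.
    mass = sum(probs[i] for i in kept)
    r = random.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

Lower temperature sharpens the distribution toward the argmax; a small `top_p` discards the low-probability tail entirely, which is why `top_p` near 0 behaves almost greedily.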