Implementation: Microsoft AutoGen OpenAIChatCompletionClient
| Knowledge Sources | |
|---|---|
| Domains | LLM Integration, Model Configuration, AI Agents, Multi-Agent Systems |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
A concrete tool provided by Microsoft AutoGen for configuring and connecting to OpenAI-compatible LLM endpoints.
Description
OpenAIChatCompletionClient is the primary model client class in AutoGen for connecting to OpenAI and OpenAI-compatible API endpoints. It wraps the OpenAI Python SDK's async client, handles authentication, and provides a unified interface for sending chat completion requests. The client supports multiple providers through base URL routing: OpenAI, Azure OpenAI (via the sibling AzureOpenAIChatCompletionClient), Google Gemini, Anthropic, Meta Llama API, and any custom endpoint that follows the OpenAI API format.
During initialization, the client performs intelligent provider detection based on the model name prefix. For example, models starting with "gemini-" automatically route to Google's OpenAI-compatible endpoint, models starting with "claude-" route to Anthropic's endpoint, and models starting with "Llama-" route to Meta's endpoint. API keys are also resolved from environment variables when not provided explicitly.
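The prefix-based detection can be illustrated with a small sketch. This is a hypothetical simplification, not AutoGen's actual code; the endpoint URLs below are assumptions based on each provider's published OpenAI-compatible endpoints, and the real mapping lives in AutoGen's internal tables:

```python
# Hypothetical sketch of prefix-based provider routing.
# The URLs are illustrative assumptions, not taken from AutoGen's source.
PROVIDER_ENDPOINTS = {
    "gemini-": "https://generativelanguage.googleapis.com/v1beta/openai/",
    "claude-": "https://api.anthropic.com/v1/",
    "Llama-": "https://api.llama.com/compat/v1/",
}

def resolve_base_url(model: str):
    """Return a provider endpoint for known model prefixes, else None."""
    for prefix, url in PROVIDER_ENDPOINTS.items():
        if model.startswith(prefix):
            return url
    return None  # no match: fall through to the default OpenAI endpoint
```

With this sketch, "gemini-1.5-pro" resolves to the Gemini endpoint, while "gpt-4o" returns None and uses the standard OpenAI base URL.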
The client also resolves model capabilities (function calling support, vision, JSON output) from an internal model info registry, with support for explicit overrides via the model_info parameter.
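The capability-resolution behavior can be approximated as a registry lookup with an explicit override. This is a hedged sketch, assuming a simple dict-based registry; AutoGen's real registry is much larger and is keyed and typed differently:

```python
# Hypothetical sketch of capability resolution: an explicit model_info
# override wins; known models fall back to the built-in registry; unknown
# models without an override raise an error.
REGISTRY = {
    "gpt-4o": {"vision": True, "function_calling": True, "json_output": True},
    "gpt-4o-mini": {"vision": True, "function_calling": True, "json_output": True},
}

def resolve_model_info(model: str, model_info=None):
    if model_info is not None:
        return model_info  # explicit override takes precedence
    if model in REGISTRY:
        return REGISTRY[model]
    raise ValueError(f"model_info is required for unknown model {model!r}")
```

This mirrors why the custom-endpoint example below must pass model_info explicitly: a self-hosted model name is not in the built-in registry.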
Usage
Import and instantiate OpenAIChatCompletionClient at the start of any AutoGen workflow before creating agents. Pass the resulting client instance to agent constructors as the model_client parameter. Use it whenever you need agents to communicate with an LLM.
Code Reference
Source Location
- Repository: Microsoft AutoGen
- File: python/packages/autogen-ext/src/autogen_ext/models/openai/_openai_client.py (lines 1441-1494)
Signature
class OpenAIChatCompletionClient:
    def __init__(self, **kwargs: Unpack[OpenAIClientConfiguration]):
        ...
The OpenAIClientConfiguration TypedDict defines all accepted keyword arguments.
Import
from autogen_ext.models.openai import OpenAIChatCompletionClient
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | str | Yes | The model identifier (e.g., "gpt-4o", "gpt-4o-mini", "gemini-1.5-pro", "claude-3-5-sonnet-20241022"). |
| api_key | str | No | API key for authentication. If omitted, resolved from environment variables (OPENAI_API_KEY, GEMINI_API_KEY, ANTHROPIC_API_KEY, LLAMA_API_KEY depending on provider). |
| base_url | str | No | Custom API endpoint URL. Auto-detected for known providers (Gemini, Anthropic, Llama). Use for self-hosted or proxy endpoints. |
| organization | str | No | OpenAI organization ID for billing and access control. |
| temperature | float or None | No | Sampling temperature controlling randomness (0.0 = deterministic, 2.0 = very random). |
| max_tokens | int or None | No | Maximum number of tokens to generate in the completion. |
| top_p | float or None | No | Nucleus sampling parameter. An alternative to temperature. |
| frequency_penalty | float or None | No | Penalty for token frequency to reduce repetition (-2.0 to 2.0). |
| presence_penalty | float or None | No | Penalty for token presence to encourage topic diversity (-2.0 to 2.0). |
| stop | str or List[str] or None | No | Stop sequences that halt generation when encountered. |
| seed | int or None | No | Random seed for reproducible outputs (best-effort by provider). |
| timeout | float or None | No | Request timeout in seconds. |
| max_retries | int | No | Maximum number of retry attempts for failed requests. |
| model_info | ModelInfo | No | Explicit model capability declaration overriding the built-in registry. |
| model_capabilities | ModelCapabilities | No | Deprecated capability declaration. Use model_info instead. |
| add_name_prefixes | bool | No | Whether to add name prefixes to messages. Defaults to False. |
| include_name_in_message | bool | No | Whether to include the 'name' field in user message parameters. Defaults to True. Set to False for providers that do not support it. |
| default_headers | Dict[str, str] or None | No | Custom HTTP headers to include with every request. |
| reasoning_effort | str or None | No | Controls reasoning depth for reasoning models (e.g., o1, o3-mini). One of "minimal", "low", "medium", "high". |
| parallel_tool_calls | bool or None | No | Whether the model can make multiple tool calls in a single response. |
Outputs
| Name | Type | Description |
|---|---|---|
| instance | OpenAIChatCompletionClient | A configured model client instance that implements the ChatCompletionClient protocol. Pass this to agent constructors. |
Usage Examples
Basic Example
from autogen_ext.models.openai import OpenAIChatCompletionClient
# Basic OpenAI configuration
model_client = OpenAIChatCompletionClient(
model="gpt-4o",
api_key="sk-..."
)
Custom Endpoint Example
from autogen_ext.models.openai import OpenAIChatCompletionClient
# Connect to a local or custom OpenAI-compatible endpoint
model_client = OpenAIChatCompletionClient(
model="my-local-model",
base_url="http://localhost:8000/v1",
api_key="not-needed",
model_info={
"vision": False,
"function_calling": True,
"json_output": True,
"family": "unknown",
"structured_output": False,
}
)
Loading from Configuration
from autogen_core.models import ChatCompletionClient
config = {
"provider": "OpenAIChatCompletionClient",
"config": {"model": "gpt-4o", "api_key": "sk-..."},
}
client = ChatCompletionClient.load_component(config)