Implementation: Groq Python Completions.create()
| Knowledge Sources | |
|---|---|
| Domains | NLP, API_Client |
| Last Updated | 2026-02-15 16:00 GMT |
Overview
A concrete tool, provided by the Groq Python SDK, for executing synchronous chat completion requests.
Description
The Completions.create() method sends a chat completion request to the Groq API endpoint /openai/v1/chat/completions. It accepts conversation messages, a model identifier, and optional generation parameters. The method uses Python @overload decorators to provide type-safe signatures for streaming vs non-streaming modes.
When stream is False or omitted, the method returns a fully-parsed ChatCompletion object. The request body is transformed via maybe_transform() against CompletionCreateParams TypedDict, and the response is deserialized into a Pydantic BaseModel.
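The @overload pattern mentioned above can be sketched in isolation. The following is a minimal toy, not the SDK's actual code: the `dict` and `list` return types stand in for `ChatCompletion` and `Stream[ChatCompletionChunk]`, and the return values are fabricated for illustration.

```python
from typing import Literal, Union, overload

# Two type-safe signatures: stream=False returns one parsed object,
# stream=True returns an iterable of chunks.
@overload
def create(*, stream: Literal[False] = ...) -> dict: ...
@overload
def create(*, stream: Literal[True]) -> list: ...

def create(*, stream: bool = False) -> Union[dict, list]:
    # Toy implementation: real code would issue an HTTP request.
    if stream:
        return [{"delta": "Hel"}, {"delta": "lo"}]
    return {"choices": [{"message": {"content": "Hello"}}]}

full = create(stream=False)
print(full["choices"][0]["message"]["content"])
```

A static type checker resolves the call site against the matching overload, so `create(stream=True)` is known to yield chunks while `create()` is known to yield a complete object.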
Usage
Use this method for standard synchronous chat completions where you need the full response before continuing. Access it via client.chat.completions.create().
Code Reference
Source Location
- Repository: groq-python
- File: src/groq/resources/chat/completions.py
- Lines: L52-239 (overloads), L241-509 (implementation)
Signature
```python
class Completions(SyncAPIResource):
    def create(
        self,
        *,
        messages: Iterable[ChatCompletionMessageParam],
        model: Union[str, Literal[
            "compound-beta", "compound-beta-mini", "gemma2-9b-it",
            "llama-3.1-8b-instant", "llama-3.3-70b-versatile",
            "meta-llama/llama-4-maverick-17b-128e-instruct",
            "meta-llama/llama-4-scout-17b-16e-instruct",
            "meta-llama/llama-guard-4-12b",
            "qwen/qwen3-32b",
        ]],
        temperature: Optional[float] | Omit = omit,
        max_completion_tokens: Optional[int] | Omit = omit,
        max_tokens: Optional[int] | Omit = omit,
        top_p: Optional[float] | Omit = omit,
        stream: Optional[Literal[False]] | Omit = omit,
        stop: Union[Optional[str], SequenceNotStr[str], None] | Omit = omit,
        response_format: Optional[ResponseFormat] | Omit = omit,
        tools: Optional[Iterable[ChatCompletionToolParam]] | Omit = omit,
        tool_choice: Optional[ChatCompletionToolChoiceOptionParam] | Omit = omit,
        seed: Optional[int] | Omit = omit,
        n: Optional[int] | Omit = omit,
        frequency_penalty: Optional[float] | Omit = omit,
        presence_penalty: Optional[float] | Omit = omit,
        logprobs: Optional[bool] | Omit = omit,
        # ... additional optional parameters
    ) -> ChatCompletion:
        ...
```
Import
```python
from groq import Groq

# Access via: client.chat.completions.create(...)
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| messages | Iterable[ChatCompletionMessageParam] | Yes | The conversation messages |
| model | str or Literal[...] | Yes | Model identifier (e.g., "llama-3.3-70b-versatile") |
| temperature | Optional[float] | No | Sampling temperature, 0 to 2 |
| max_completion_tokens | Optional[int] | No | Max tokens to generate |
| top_p | Optional[float] | No | Nucleus sampling threshold |
| stop | str or List[str] or None | No | Stop sequence(s) |
| response_format | Optional[ResponseFormat] | No | JSON mode or structured output |
| tools | Optional[Iterable[ChatCompletionToolParam]] | No | Tool/function definitions |
| seed | Optional[int] | No | Random seed for reproducibility |
Outputs
| Name | Type | Description |
|---|---|---|
| (return) | ChatCompletion | Complete response with id, choices, created, model, usage fields |
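As a sketch of this output contract, the JSON payload underlying a ChatCompletion has roughly the shape below, with the `id`, `choices`, `created`, `model`, and `usage` fields from the table. The values here are fabricated for illustration, not a real API response.

```python
import json

# Illustrative payload mirroring the documented ChatCompletion fields.
payload = """
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "llama-3.3-70b-versatile",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "Low latency matters."},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 20, "completion_tokens": 6, "total_tokens": 26}
}
"""
resp = json.loads(payload)
print(resp["choices"][0]["message"]["content"])
print(resp["usage"]["total_tokens"])
```

The SDK deserializes this payload into a Pydantic model, so the same fields are reached via attribute access (`response.choices[0].message.content`) rather than dict indexing.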
Usage Examples
Basic Chat Completion
```python
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

response = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the importance of low latency LLMs"},
    ],
    model="llama-3.3-70b-versatile",
    temperature=0.5,
    max_tokens=1024,
)
print(response.choices[0].message.content)
```
With JSON Response Format
```python
import json

from groq import Groq

client = Groq()

response = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "Respond in JSON format."},
        {"role": "user", "content": "List 3 programming languages and their uses."},
    ],
    model="llama-3.3-70b-versatile",
    response_format={"type": "json_object"},
)
data = json.loads(response.choices[0].message.content)
```
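Because the model's reply arrives as a string, it is worth guarding the `json.loads` call rather than assuming well-formed output. A small hedged helper (`parse_json_reply` is our own name, not part of the SDK):

```python
import json

def parse_json_reply(text: str) -> dict:
    """Parse a model reply expected to be JSON, raising a clear error otherwise."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model did not return valid JSON: {exc}") from exc

# Example reply string standing in for response.choices[0].message.content.
reply = '{"languages": [{"name": "Python", "use": "scripting"}]}'
data = parse_json_reply(reply)
print(data["languages"][0]["name"])
```

In application code the caller can catch the `ValueError` and retry the request, since JSON mode constrains but does not absolutely guarantee parseable output.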