
Implementation:Groq Groq python Completions Create

From Leeroopedia
Knowledge Sources
Domains NLP, API_Client
Last Updated 2026-02-15 16:00 GMT

Overview

A concrete tool in the Groq Python SDK for executing synchronous chat completion requests.

Description

The Completions.create() method sends a chat completion request to the Groq API endpoint /openai/v1/chat/completions. It accepts conversation messages, a model identifier, and optional generation parameters. The method uses Python @overload decorators to provide type-safe signatures for streaming vs non-streaming modes.

When stream is False or omitted, the method returns a fully-parsed ChatCompletion object. The request body is transformed via maybe_transform() against CompletionCreateParams TypedDict, and the response is deserialized into a Pydantic BaseModel.
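
The transformed request body is an ordinary JSON-serializable dict. As a rough sketch (the helper below is hypothetical; the SDK's actual maybe_transform() works against the CompletionCreateParams TypedDict and handles nested types), the shape it produces looks like this:

```python
# Hypothetical sketch of the request body assembled before POSTing to
# /openai/v1/chat/completions. The real SDK uses maybe_transform() with
# CompletionCreateParams; this plain function only illustrates the shape.
from typing import Any, Dict, Iterable, Optional


def build_request_body(
    messages: Iterable[Dict[str, str]],
    model: str,
    temperature: Optional[float] = None,
    max_completion_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    body: Dict[str, Any] = {"messages": list(messages), "model": model}
    # Omitted parameters are left out of the payload entirely,
    # mirroring the SDK's Omit/omit sentinel behavior.
    if temperature is not None:
        body["temperature"] = temperature
    if max_completion_tokens is not None:
        body["max_completion_tokens"] = max_completion_tokens
    return body


body = build_request_body(
    messages=[{"role": "user", "content": "Hello"}],
    model="llama-3.3-70b-versatile",
    temperature=0.5,
)
```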

Usage

Use this method for standard synchronous chat completions where you need the full response before continuing. Access it via client.chat.completions.create().

Code Reference

Source Location

  • Repository: groq-python
  • File: src/groq/resources/chat/completions.py
  • Lines: L52-239 (overloads), L241-509 (implementation)

Signature

class Completions(SyncAPIResource):
    def create(
        self,
        *,
        messages: Iterable[ChatCompletionMessageParam],
        model: Union[str, Literal[
            "compound-beta", "compound-beta-mini", "gemma2-9b-it",
            "llama-3.1-8b-instant", "llama-3.3-70b-versatile",
            "meta-llama/llama-4-maverick-17b-128e-instruct",
            "meta-llama/llama-4-scout-17b-16e-instruct",
            "meta-llama/llama-guard-4-12b",
            "qwen/qwen3-32b",
        ]],
        temperature: Optional[float] | Omit = omit,
        max_completion_tokens: Optional[int] | Omit = omit,
        max_tokens: Optional[int] | Omit = omit,
        top_p: Optional[float] | Omit = omit,
        stream: Optional[Literal[False]] | Omit = omit,
        stop: Union[Optional[str], SequenceNotStr[str], None] | Omit = omit,
        response_format: Optional[ResponseFormat] | Omit = omit,
        tools: Optional[Iterable[ChatCompletionToolParam]] | Omit = omit,
        tool_choice: Optional[ChatCompletionToolChoiceOptionParam] | Omit = omit,
        seed: Optional[int] | Omit = omit,
        n: Optional[int] | Omit = omit,
        frequency_penalty: Optional[float] | Omit = omit,
        presence_penalty: Optional[float] | Omit = omit,
        logprobs: Optional[bool] | Omit = omit,
        # ... additional optional parameters
    ) -> ChatCompletion:

Import

from groq import Groq
# Access via: client.chat.completions.create(...)

I/O Contract

Inputs

  • messages — Iterable[ChatCompletionMessageParam], required. The conversation messages.
  • model — str or Literal[...], required. Model identifier (e.g., "llama-3.3-70b-versatile").
  • temperature — Optional[float], optional. Sampling temperature, 0 to 2.
  • max_completion_tokens — Optional[int], optional. Maximum number of tokens to generate.
  • top_p — Optional[float], optional. Nucleus sampling threshold.
  • stop — str, List[str], or None, optional. Stop sequence(s).
  • response_format — Optional[ResponseFormat], optional. JSON mode or structured output.
  • tools — Optional[Iterable[ChatCompletionToolParam]], optional. Tool/function definitions.
  • seed — Optional[int], optional. Random seed for reproducibility.

Outputs

  • (return) — ChatCompletion. The complete response, with id, choices, created, model, and usage fields.

Usage Examples

Basic Chat Completion

from groq import Groq

client = Groq()

response = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the importance of low latency LLMs"},
    ],
    model="llama-3.3-70b-versatile",
    temperature=0.5,
    max_tokens=1024,
)

print(response.choices[0].message.content)
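
Streaming for Comparison

For comparison, passing stream=True selects the streaming overload, which returns an iterator of chunks instead of a single ChatCompletion. A minimal sketch (requires a valid GROQ_API_KEY; chunk shape follows the SDK's OpenAI-compatible delta format):

```python
from groq import Groq

client = Groq()

stream = client.chat.completions.create(
    messages=[{"role": "user", "content": "Count to five."}],
    model="llama-3.3-70b-versatile",
    stream=True,
)

for chunk in stream:
    # delta.content can be None on some chunks (e.g., the final one)
    print(chunk.choices[0].delta.content or "", end="")
```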

With JSON Response Format

import json

from groq import Groq

client = Groq()

response = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "Respond in JSON format."},
        {"role": "user", "content": "List 3 programming languages and their uses."},
    ],
    model="llama-3.3-70b-versatile",
    response_format={"type": "json_object"},
)

data = json.loads(response.choices[0].message.content)
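
With Tool Definitions

The tools and tool_choice parameters accept OpenAI-style function definitions. A minimal sketch (the function name and schema below are illustrative, not part of the SDK; requires a valid GROQ_API_KEY):

```python
from groq import Groq

client = Groq()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for illustration
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    model="llama-3.3-70b-versatile",
    tools=tools,
    tool_choice="auto",
)

# When the model elects to call a tool, tool_calls carries the requested
# function name and its JSON-encoded arguments.
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)
```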

Related Pages

Implements Principle

Requires Environment

Uses Heuristics
