Implementation: Groq Python Stream Iterator
| Knowledge Sources | |
|---|---|
| Domains | Streaming, Data_Parsing |
| Last Updated | 2026-02-15 16:00 GMT |
Overview
Concrete iterator classes for consuming SSE-streamed chat completion chunks provided by the Groq Python SDK.
Description
The Stream[T] class (and its async counterpart AsyncStream[T]) provides the core interface for iterating over streamed responses. It wraps an httpx.Response byte stream, decodes SSE events via SSEBytesDecoder, parses JSON data, and yields typed ChatCompletionChunk objects. The stream terminates when a [DONE] SSE message is received.
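To make the decoding step concrete, here is a minimal sketch of the kind of SSE framing the stream decoder handles: extract JSON payloads from `data:` lines and stop at the `[DONE]` sentinel. The function name `parse_sse_data` is illustrative only, not part of the Groq SDK.

```python
import json

DONE_SENTINEL = "[DONE]"

def parse_sse_data(raw: str) -> list[dict]:
    """Illustrative only: extract JSON payloads from SSE 'data:' lines,
    stopping when the [DONE] sentinel is reached."""
    events = []
    for line in raw.splitlines():
        # Skip blank separators and non-data fields (event:, id:, comments).
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == DONE_SENTINEL:
            break  # server signals end of stream
        events.append(json.loads(payload))
    return events
```

The real `SSEBytesDecoder` operates on the raw `httpx.Response` byte stream and yields typed `ChatCompletionChunk` objects, but the framing logic follows this shape.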
The ChatCompletionChunk model represents a single streaming event with a choices list containing ChoiceDelta objects (delta.content, delta.role, delta.tool_calls) and optional x_groq metadata (usage stats in the final chunk).
Usage
Iterate over the Stream object returned by create(stream=True) using a for loop. Each iteration yields a ChatCompletionChunk. Access chunk.choices[0].delta.content for token text. The stream also supports the context manager protocol (with statement) for automatic cleanup.
Code Reference
Source Location
- Repository: groq-python
- File: src/groq/_streaming.py (Stream class: L22-120, AsyncStream: L122-221)
- File: src/groq/types/chat/chat_completion_chunk.py (ChatCompletionChunk: L183-224)
Signature
class Stream(Generic[_T]):
    response: httpx.Response

    def __init__(self, *, cast_to: type[_T], response: httpx.Response, client: Groq) -> None: ...
    def __iter__(self) -> Iterator[_T]: ...
    def __next__(self) -> _T: ...
    def close(self) -> None: ...
    def __enter__(self) -> Self: ...
    def __exit__(self, ...) -> None: ...
class ChatCompletionChunk(BaseModel):
    id: str
    choices: List[Choice]
    created: int
    model: str
    object: Literal["chat.completion.chunk"]
    x_groq: Optional[XGroq] = None
Import
from groq import Stream
from groq.types.chat import ChatCompletionChunk
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| (stream) | Stream[ChatCompletionChunk] | Yes | Returned from create(stream=True) |
Outputs
| Name | Type | Description |
|---|---|---|
| chunk.choices[i].delta.content | Optional[str] | Token text fragment |
| chunk.choices[i].delta.role | Optional[str] | Role (set on first chunk only) |
| chunk.choices[i].finish_reason | Optional[str] | None until final chunk, then "stop", "length", etc. |
| chunk.x_groq.usage | Optional[CompletionUsage] | Token usage (final chunk only) |
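Since delta.tool_calls arrives as partial fragments spread across chunks, a client typically merges them by index before use. The merge-by-index helper below is a sketch of that client-side pattern (shown with plain dicts); `merge_tool_call_deltas` is a hypothetical name, not a Groq SDK API.

```python
def merge_tool_call_deltas(deltas: list[dict]) -> dict[int, dict]:
    """Merge partial tool_call fragments, keyed by each fragment's 'index'.
    The id and function name usually arrive in the first fragment; the
    arguments string accumulates across subsequent fragments."""
    calls: dict[int, dict] = {}
    for delta in deltas:
        for tc in delta.get("tool_calls") or []:
            slot = calls.setdefault(
                tc["index"], {"id": None, "name": "", "arguments": ""}
            )
            if tc.get("id"):
                slot["id"] = tc["id"]
            fn = tc.get("function") or {}
            slot["name"] += fn.get("name") or ""
            slot["arguments"] += fn.get("arguments") or ""
    return calls
```

After the final chunk, each merged entry holds a complete arguments string ready for json.loads.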
Usage Examples
Token-by-Token Processing
from groq import Groq

client = Groq()
stream = client.chat.completions.create(
    messages=[{"role": "user", "content": "Tell me a joke"}],
    model="llama-3.3-70b-versatile",
    stream=True,
)

full_text = ""
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        full_text += delta.content
        print(delta.content, end="")
    if chunk.choices[0].finish_reason:
        assert chunk.x_groq is not None
        assert chunk.x_groq.usage is not None
        print(f"\n\nUsage: {chunk.x_groq.usage}")
Context Manager Usage
with client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}],
    model="llama-3.3-70b-versatile",
    stream=True,
) as stream:
    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")
# Stream automatically closed on exit
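Async Consumption Pattern
The AsyncStream counterpart is consumed with async for on a client created via AsyncGroq. The sketch below shows the iteration pattern using a stubbed async generator in place of a live stream, so no API call is made; with a real client the stream would come from await client.chat.completions.create(..., stream=True).

```python
import asyncio

async def fake_stream():
    # Stand-in for AsyncStream[ChatCompletionChunk]: yields plain strings
    # here, where a real stream yields ChatCompletionChunk objects.
    for piece in ["Hel", "lo", "!"]:
        yield piece

async def consume(stream) -> str:
    # With a real chunk you would read chunk.choices[0].delta.content.
    text = ""
    async for piece in stream:
        text += piece
    return text

result = asyncio.run(consume(fake_stream()))
```

With the real SDK the body is the same; only the source of the stream changes (AsyncGroq client, awaited create call).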