Implementation: Groq Python Stream Iterator
| Knowledge Sources | |
|---|---|
| Domains | Streaming, Data_Parsing |
| Last Updated | 2026-02-15 16:00 GMT |
Overview
Concrete iterator classes for consuming SSE-streamed chat completion chunks provided by the Groq Python SDK.
Description
The Stream[T] class (and its async counterpart AsyncStream[T]) provides the core interface for iterating over streamed responses. It wraps an httpx.Response byte stream, decodes SSE events via SSEBytesDecoder, parses JSON data, and yields typed ChatCompletionChunk objects. The stream terminates when a [DONE] SSE message is received.
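To make the decoding step concrete, here is a minimal sketch of the kind of SSE framing the stream decoder handles: extract JSON payloads from `data:` lines and stop at the `[DONE]` sentinel. The function name `parse_sse_data` is illustrative only, not part of the Groq SDK.

```python
import json

DONE_SENTINEL = "[DONE]"

def parse_sse_data(raw: str) -> list[dict]:
    """Illustrative only: extract JSON payloads from SSE 'data:' lines,
    stopping when the [DONE] sentinel is reached."""
    events = []
    for line in raw.splitlines():
        # Skip blank separators and non-data fields (event:, id:, comments).
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == DONE_SENTINEL:
            break  # server signals end of stream
        events.append(json.loads(payload))
    return events
```

The real `SSEBytesDecoder` operates on the raw `httpx.Response` byte stream and yields typed `ChatCompletionChunk` objects, but the framing logic follows this shape.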
The ChatCompletionChunk model represents a single streaming event with a choices list containing ChoiceDelta objects (delta.content, delta.role, delta.tool_calls) and optional x_groq metadata (usage stats in the final chunk).
Usage
Iterate over the Stream object returned by create(stream=True) using a for loop. Each iteration yields a ChatCompletionChunk. Access chunk.choices[0].delta.content for token text. The stream also supports the context manager protocol (with statement) for automatic cleanup.
Code Reference
Source Location
- Repository: groq-python
- File: src/groq/_streaming.py (Stream class: L22-120, AsyncStream: L122-221)
- File: src/groq/types/chat/chat_completion_chunk.py (ChatCompletionChunk: L183-224)
Signature
class Stream(Generic[_T]):
    response: httpx.Response

    def __init__(self, *, cast_to: type[_T], response: httpx.Response, client: Groq) -> None: ...
    def __iter__(self) -> Iterator[_T]: ...
    def __next__(self) -> _T: ...
    def close(self) -> None: ...
    def __enter__(self) -> Self: ...
    def __exit__(self, ...) -> None: ...
class ChatCompletionChunk(BaseModel):
    id: str
    choices: List[Choice]
    created: int
    model: str
    object: Literal["chat.completion.chunk"]
    x_groq: Optional[XGroq] = None
Import
from groq import Stream
from groq.types.chat import ChatCompletionChunk
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| (stream) | Stream[ChatCompletionChunk] | Yes | Returned from create(stream=True) |
Outputs
| Name | Type | Description |
|---|---|---|
| chunk.choices[i].delta.content | Optional[str] | Token text fragment |
| chunk.choices[i].delta.role | Optional[str] | Role (set on first chunk only) |
| chunk.choices[i].finish_reason | Optional[str] | None until final chunk, then "stop", "length", etc. |
| chunk.x_groq.usage | Optional[CompletionUsage] | Token usage (final chunk only) |
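Since delta.tool_calls arrives as partial fragments spread across chunks, a client typically merges them by index before use. The merge-by-index helper below is a sketch of that client-side pattern (shown with plain dicts); `merge_tool_call_deltas` is a hypothetical name, not a Groq SDK API.

```python
def merge_tool_call_deltas(deltas: list[dict]) -> dict[int, dict]:
    """Merge partial tool_call fragments, keyed by each fragment's 'index'.
    The id and function name usually arrive in the first fragment; the
    arguments string accumulates across subsequent fragments."""
    calls: dict[int, dict] = {}
    for delta in deltas:
        for tc in delta.get("tool_calls") or []:
            slot = calls.setdefault(
                tc["index"], {"id": None, "name": "", "arguments": ""}
            )
            if tc.get("id"):
                slot["id"] = tc["id"]
            fn = tc.get("function") or {}
            slot["name"] += fn.get("name") or ""
            slot["arguments"] += fn.get("arguments") or ""
    return calls
```

After the final chunk, each merged entry holds a complete arguments string ready for json.loads.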
Usage Examples
Token-by-Token Processing
from groq import Groq

client = Groq()
stream = client.chat.completions.create(
    messages=[{"role": "user", "content": "Tell me a joke"}],
    model="llama-3.3-70b-versatile",
    stream=True,
)

full_text = ""
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        full_text += delta.content
        print(delta.content, end="")
    if chunk.choices[0].finish_reason:
        assert chunk.x_groq is not None
        assert chunk.x_groq.usage is not None
        print(f"\n\nUsage: {chunk.x_groq.usage}")
Context Manager Usage
with client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}],
    model="llama-3.3-70b-versatile",
    stream=True,
) as stream:
    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")
# Stream automatically closed on exit
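Async Consumption Pattern
The AsyncStream counterpart is consumed with async for on a client created via AsyncGroq. The sketch below shows the iteration pattern using a stubbed async generator in place of a live stream, so no API call is made; with a real client the stream would come from await client.chat.completions.create(..., stream=True).

```python
import asyncio

async def fake_stream():
    # Stand-in for AsyncStream[ChatCompletionChunk]: yields plain strings
    # here, where a real stream yields ChatCompletionChunk objects.
    for piece in ["Hel", "lo", "!"]:
        yield piece

async def consume(stream) -> str:
    # With a real chunk you would read chunk.choices[0].delta.content.
    text = ""
    async for piece in stream:
        text += piece
    return text

result = asyncio.run(consume(fake_stream()))
```

With the real SDK the body is the same; only the source of the stream changes (AsyncGroq client, awaited create call).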