Implementation: cohere-ai/cohere-python V2Client.chat_stream
| Field | Value |
|---|---|
| Source Repo | Cohere Python SDK |
| Source Doc | Cohere Streaming |
| Domains | NLP, Text_Generation, Streaming |
| Last Updated | 2026-02-15 14:00 GMT |
Overview
Concrete method for sending streaming chat requests to Cohere models and receiving incremental responses as Server-Sent Events (SSE).
Description
V2Client.chat_stream sends a streaming chat request to the Cohere V2 API. It returns an Iterator[V2ChatStreamResponse] that yields typed stream events. The method accepts the same parameters as V2Client.chat but uses the SSE streaming endpoint. Internally, it delegates to RawV2Client.chat_stream, which uses the EventSource/SSEDecoder infrastructure to parse the event stream.
Usage
Call this method when you need incremental response delivery. Iterate over the returned iterator to process events as they arrive. The iterator yields typed V2ChatStreamResponse events discriminated by the type field.
Code Reference
- Source Location: Repository cohere-ai/cohere-python (https://github.com/cohere-ai/cohere-python), File src/cohere/v2/client.py, Lines L47-215
Signature:
def chat_stream(
    self,
    *,
    model: str,
    messages: ChatMessages,
    tools: typing.Optional[typing.Sequence[ToolV2]] = OMIT,
    strict_tools: typing.Optional[bool] = OMIT,
    documents: typing.Optional[typing.Sequence[V2ChatStreamRequestDocumentsItem]] = OMIT,
    citation_options: typing.Optional[CitationOptions] = OMIT,
    response_format: typing.Optional[ResponseFormatV2] = OMIT,
    safety_mode: typing.Optional[V2ChatStreamRequestSafetyMode] = OMIT,
    max_tokens: typing.Optional[int] = OMIT,
    stop_sequences: typing.Optional[typing.Sequence[str]] = OMIT,
    temperature: typing.Optional[float] = OMIT,
    seed: typing.Optional[int] = OMIT,
    frequency_penalty: typing.Optional[float] = OMIT,
    presence_penalty: typing.Optional[float] = OMIT,
    k: typing.Optional[int] = OMIT,
    p: typing.Optional[float] = OMIT,
    logprobs: typing.Optional[bool] = OMIT,
    tool_choice: typing.Optional[V2ChatStreamRequestToolChoice] = OMIT,
    thinking: typing.Optional[Thinking] = OMIT,
    priority: typing.Optional[int] = OMIT,
    request_options: typing.Optional[RequestOptions] = None,
) -> typing.Iterator[V2ChatStreamResponse]:
Import:
from cohere import ClientV2 # access via client.chat_stream()
I/O Contract
Inputs
| Parameter | Required | Description |
|---|---|---|
| model (str) | Yes | Model name, e.g. "command-a-03-2025" |
| messages (ChatMessages) | Yes | Conversation history |
| tools | No | Tool definitions for tool use |
| temperature (float) | No | Sampling temperature |
| max_tokens (int) | No | Maximum tokens to generate |
| stop_sequences | No | Sequences that stop generation |
| seed (int) | No | Random seed for reproducibility |
| frequency_penalty (float) | No | Frequency penalty |
| presence_penalty (float) | No | Presence penalty |
| k (int) | No | Top-k sampling parameter |
| p (float) | No | Top-p (nucleus) sampling parameter |
| logprobs (bool) | No | Whether to return log probabilities |
| tool_choice | No | Tool choice configuration |
| thinking | No | Thinking/reasoning configuration |
| priority (int) | No | Request priority |
| request_options | No | HTTP request options |
Outputs
Iterator[V2ChatStreamResponse] yielding events discriminated by their type field:
| Event Type | Description |
|---|---|
| message-start | Initial event with message ID |
| content-start | Marks beginning of content block |
| content-delta | Incremental text token (in delta.message.content.text) |
| content-end | Marks end of content block |
| tool-plan-delta | Incremental tool planning text |
| tool-call-start | Beginning of tool call |
| tool-call-delta | Incremental tool call arguments |
| tool-call-end | End of tool call |
| citation-start | Citation begin |
| citation-end | Citation end |
| message-end | Final event with finish_reason and usage |
| debug | Debug information |
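The tool-call events above stream a function call's JSON arguments in fragments, so a consumer must buffer tool-call-delta payloads until tool-call-end before parsing. A minimal sketch of that buffering pattern, using simple stand-in event objects rather than the SDK's typed events (the attribute paths on real events differ, so treat the field names here as illustrative):

```python
import json
from dataclasses import dataclass


# Stand-in for a streamed event; real V2ChatStreamResponse events are typed
# SDK models, but the buffer-until-end pattern is the same.
@dataclass
class Event:
    type: str
    payload: str = ""


def collect_tool_arguments(events):
    """Buffer argument fragments between tool-call-start and tool-call-end,
    then parse the completed JSON string."""
    buffer = []
    for event in events:
        if event.type == "tool-call-start":
            buffer = []
        elif event.type == "tool-call-delta":
            # Individual fragments are not valid JSON on their own.
            buffer.append(event.payload)
        elif event.type == "tool-call-end":
            return json.loads("".join(buffer))
    return None


# Simulated stream: the arguments object arrives split across two deltas.
stream = [
    Event("tool-call-start"),
    Event("tool-call-delta", '{"city": "Par'),
    Event("tool-call-delta", 'is", "units": "metric"}'),
    Event("tool-call-end"),
]
print(collect_tool_arguments(stream))  # → {'city': 'Paris', 'units': 'metric'}
```

The same accumulate-then-parse approach applies to any event pair that brackets fragmented structured data.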
Usage Examples
from cohere import ClientV2, UserChatMessageV2

client = ClientV2()

for event in client.chat_stream(
    model="command-a-03-2025",
    messages=[UserChatMessageV2(content="Write a poem about the ocean.")],
):
    if event.type == "content-delta":
        print(event.delta.message.content.text, end="")
    elif event.type == "message-end":
        print()
        print(f"Finish reason: {event.delta.finish_reason}")
        if event.delta.usage:
            print(f"Usage: {event.delta.usage}")
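When the complete message text is needed after streaming ends, the content-delta fragments can be joined as they arrive. A sketch of that accumulation step with stand-in events (on real SDK events the text sits at event.delta.message.content.text, per the table above; the flat fields here are a simplification):

```python
from dataclasses import dataclass


# Stand-in event carrying only the fields this sketch needs.
@dataclass
class Event:
    type: str
    text: str = ""
    finish_reason: str = ""


def accumulate(events):
    """Join streamed text fragments and capture the final finish reason."""
    parts, finish = [], None
    for event in events:
        if event.type == "content-delta":
            parts.append(event.text)
        elif event.type == "message-end":
            finish = event.finish_reason
    return "".join(parts), finish


# Simulated stream of a short completion.
stream = [
    Event("message-start"),
    Event("content-delta", text="The sea "),
    Event("content-delta", text="is wide."),
    Event("message-end", finish_reason="COMPLETE"),
]
text, finish = accumulate(stream)
print(text)    # → The sea is wide.
print(finish)  # → COMPLETE
```

This pattern pairs naturally with the print-as-you-go loop above: stream tokens to the user while also appending them to a buffer for later use.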