Implementation: cohere-ai/cohere-python V2Client.chat_stream
| Field | Value |
|---|---|
| Source Repo | Cohere Python SDK |
| Source Doc | Cohere Streaming |
| Domains | NLP, Text_Generation, Streaming |
| Last Updated | 2026-02-15 14:00 GMT |
Overview
Concrete method for sending streaming chat requests to Cohere models and receiving incremental responses as Server-Sent Events (SSE).
Description
V2Client.chat_stream sends a streaming chat request to the Cohere V2 API. It returns an Iterator[V2ChatStreamResponse] that yields typed stream events. The method accepts the same parameters as V2Client.chat but uses the SSE streaming endpoint. Internally, it delegates to RawV2Client.chat_stream, which uses the EventSource/SSEDecoder infrastructure to parse the event stream.
Usage
Call this method when you need incremental response delivery. Iterate over the returned iterator to process events as they arrive. The iterator yields typed V2ChatStreamResponse events discriminated by the type field.
Code Reference
- Source Location: Repository cohere-ai/cohere-python (https://github.com/cohere-ai/cohere-python), File src/cohere/v2/client.py, Lines L47-215
Signature:
def chat_stream(
    self,
    *,
    model: str,
    messages: ChatMessages,
    tools: typing.Optional[typing.Sequence[ToolV2]] = OMIT,
    strict_tools: typing.Optional[bool] = OMIT,
    documents: typing.Optional[typing.Sequence[V2ChatStreamRequestDocumentsItem]] = OMIT,
    citation_options: typing.Optional[CitationOptions] = OMIT,
    response_format: typing.Optional[ResponseFormatV2] = OMIT,
    safety_mode: typing.Optional[V2ChatStreamRequestSafetyMode] = OMIT,
    max_tokens: typing.Optional[int] = OMIT,
    stop_sequences: typing.Optional[typing.Sequence[str]] = OMIT,
    temperature: typing.Optional[float] = OMIT,
    seed: typing.Optional[int] = OMIT,
    frequency_penalty: typing.Optional[float] = OMIT,
    presence_penalty: typing.Optional[float] = OMIT,
    k: typing.Optional[int] = OMIT,
    p: typing.Optional[float] = OMIT,
    logprobs: typing.Optional[bool] = OMIT,
    tool_choice: typing.Optional[V2ChatStreamRequestToolChoice] = OMIT,
    thinking: typing.Optional[Thinking] = OMIT,
    priority: typing.Optional[int] = OMIT,
    request_options: typing.Optional[RequestOptions] = None,
) -> typing.Iterator[V2ChatStreamResponse]:
Import:
from cohere import ClientV2 # access via client.chat_stream()
I/O Contract
Inputs
| Parameter | Required | Description |
|---|---|---|
| model (str) | Yes | Model name, e.g. "command-a-03-2025" |
| messages (ChatMessages) | Yes | Conversation history |
| tools | No | Tool definitions for tool use |
| temperature (float) | No | Sampling temperature |
| max_tokens (int) | No | Maximum tokens to generate |
| stop_sequences | No | Sequences that stop generation |
| seed (int) | No | Random seed for reproducibility |
| frequency_penalty (float) | No | Frequency penalty |
| presence_penalty (float) | No | Presence penalty |
| k (int) | No | Top-k sampling parameter |
| p (float) | No | Top-p (nucleus) sampling parameter |
| logprobs (bool) | No | Whether to return log probabilities |
| tool_choice | No | Tool choice configuration |
| thinking | No | Thinking/reasoning configuration |
| priority (int) | No | Request priority |
| request_options | No | HTTP request options |
Outputs
Iterator[V2ChatStreamResponse] yielding events discriminated by their type field:
| Event Type | Description |
|---|---|
| message-start | Initial event with message ID |
| content-start | Marks beginning of content block |
| content-delta | Incremental text token (in delta.message.content.text) |
| content-end | Marks end of content block |
| tool-plan-delta | Incremental tool planning text |
| tool-call-start | Beginning of tool call |
| tool-call-delta | Incremental tool call arguments |
| tool-call-end | End of tool call |
| citation-start | Citation begin |
| citation-end | Citation end |
| message-end | Final event with finish_reason and usage |
| debug | Debug information |
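The tool-call events above stream a function call's JSON arguments in fragments, so a consumer must buffer tool-call-delta payloads until tool-call-end before parsing. A minimal sketch of that buffering pattern, using simple stand-in event objects rather than the SDK's typed events (the attribute paths on real events differ, so treat the field names here as illustrative):

```python
import json
from dataclasses import dataclass


# Stand-in for a streamed event; real V2ChatStreamResponse events are typed
# SDK models, but the buffer-until-end pattern is the same.
@dataclass
class Event:
    type: str
    payload: str = ""


def collect_tool_arguments(events):
    """Buffer argument fragments between tool-call-start and tool-call-end,
    then parse the completed JSON string."""
    buffer = []
    for event in events:
        if event.type == "tool-call-start":
            buffer = []
        elif event.type == "tool-call-delta":
            # Individual fragments are not valid JSON on their own.
            buffer.append(event.payload)
        elif event.type == "tool-call-end":
            return json.loads("".join(buffer))
    return None


# Simulated stream: the arguments object arrives split across two deltas.
stream = [
    Event("tool-call-start"),
    Event("tool-call-delta", '{"city": "Par'),
    Event("tool-call-delta", 'is", "units": "metric"}'),
    Event("tool-call-end"),
]
print(collect_tool_arguments(stream))  # → {'city': 'Paris', 'units': 'metric'}
```

The same accumulate-then-parse approach applies to any event pair that brackets fragmented structured data.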
Usage Examples
from cohere import ClientV2, UserChatMessageV2

client = ClientV2()

for event in client.chat_stream(
    model="command-a-03-2025",
    messages=[UserChatMessageV2(content="Write a poem about the ocean.")],
):
    if event.type == "content-delta":
        print(event.delta.message.content.text, end="")
    elif event.type == "message-end":
        print()
        print(f"Finish reason: {event.delta.finish_reason}")
        if event.delta.usage:
            print(f"Usage: {event.delta.usage}")
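When the complete message text is needed after streaming ends, the content-delta fragments can be joined as they arrive. A sketch of that accumulation step with stand-in events (on real SDK events the text sits at event.delta.message.content.text, per the table above; the flat fields here are a simplification):

```python
from dataclasses import dataclass


# Stand-in event carrying only the fields this sketch needs.
@dataclass
class Event:
    type: str
    text: str = ""
    finish_reason: str = ""


def accumulate(events):
    """Join streamed text fragments and capture the final finish reason."""
    parts, finish = [], None
    for event in events:
        if event.type == "content-delta":
            parts.append(event.text)
        elif event.type == "message-end":
            finish = event.finish_reason
    return "".join(parts), finish


# Simulated stream of a short completion.
stream = [
    Event("message-start"),
    Event("content-delta", text="The sea "),
    Event("content-delta", text="is wide."),
    Event("message-end", finish_reason="COMPLETE"),
]
text, finish = accumulate(stream)
print(text)    # → The sea is wide.
print(finish)  # → COMPLETE
```

This pattern pairs naturally with the print-as-you-go loop above: stream tokens to the user while also appending them to a buffer for later use.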