Implementation:Cohere ai Cohere python AwsGeneration

From Leeroopedia
Revision as of 12:16, 16 February 2026 by Admin (Auto-imported from implementations/Cohere_ai_Cohere_python_AwsGeneration.md)
Knowledge Sources
Domains: SDK, AWS, Text Generation
Last Updated: 2026-02-15 14:00 GMT

Overview

Implements text generation result structures and a streaming response handler for Cohere generation models deployed on AWS.

Description

The AwsGeneration module provides TokenLikelihood, Generation, Generations, StreamingText, and StreamingGenerations classes for handling text generation outputs from Cohere models running on AWS SageMaker or Amazon Bedrock. The Generations class includes a from_dict factory method for deserializing raw API responses into structured objects. The StreamingGenerations class handles chunked streaming responses from both SageMaker and Bedrock endpoints, reassembling partial JSON payloads and yielding StreamingText named tuples as text fragments arrive.
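The reassembly of partial JSON payloads described above can be illustrated with a minimal, self-contained sketch. It is not the SDK's code: the function name iter_json_objects and the sample byte chunks are hypothetical, and it assumes each logical stream item is a single JSON object that may arrive split across several byte chunks.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_json_objects(chunks: Iterable[bytes]) -> Iterator[Dict[str, Any]]:
    """Yield a decoded JSON object each time the buffered bytes parse cleanly."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        try:
            item = json.loads(buffer)
        except ValueError:
            # Covers json.JSONDecodeError and UnicodeDecodeError: the buffer
            # holds only part of an object, so wait for more bytes.
            continue
        buffer.clear()
        yield item


# Hypothetical stream: one object split across two chunks, then a whole one.
fake_chunks = [
    b'{"index": 0, "te',
    b'xt": "Hello"}',
    b'{"index": 0, "text": " world"}',
]
items = list(iter_json_objects(fake_chunks))
print("".join(item["text"] for item in items))  # prints: Hello world
```

Catching ValueError rather than only json.JSONDecodeError is a design choice in this sketch, so that a chunk boundary falling inside a multi-byte UTF-8 character is also treated as "incomplete" instead of aborting the stream.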

Usage

Use these classes when consuming text generation results from Cohere models deployed on AWS. The non-streaming classes (Generation, Generations) are used for synchronous invocations, while StreamingGenerations is used when streaming is enabled, allowing incremental processing of generated text as it arrives from the endpoint.

Code Reference

Source Location

  • Repository: Cohere Python SDK
  • File: src/cohere/manually_maintained/cohere_aws/generation.py

Signature

class TokenLikelihood(CohereObject):
    def __init__(self, token: str, likelihood: float) -> None: ...

class Generation(CohereObject):
    def __init__(self, text: str, token_likelihoods: List[TokenLikelihood]) -> None: ...

class Generations(CohereObject):
    def __init__(self, generations: List[Generation]) -> None: ...
    @classmethod
    def from_dict(cls, response: Dict[str, Any]) -> "Generations": ...
    def __iter__(self) -> iter: ...
    def __next__(self) -> next: ...

StreamingText = NamedTuple("StreamingText", [
    ("index", Optional[int]),
    ("text", str),
    ("is_finished", bool),
])

class StreamingGenerations(CohereObject):
    def __init__(self, stream, mode: Mode) -> None: ...
    def _make_response_item(self, streaming_item) -> Optional[StreamingText]: ...
    def __iter__(self) -> Generator[StreamingText, None, None]: ...

Import

from cohere.manually_maintained.cohere_aws.generation import (
    TokenLikelihood,
    Generation,
    Generations,
    StreamingText,
    StreamingGenerations,
)

I/O Contract

TokenLikelihood

Parameter | Type | Description
token | str | The text token.
likelihood | float | The log-likelihood score for this token.

Generation

Parameter | Type | Description
text | str | The generated text content.
token_likelihoods | List[TokenLikelihood] | Per-token likelihood scores for the generated text. May be None if not requested.

Generations

Parameter | Type | Description
generations | List[Generation] | A list of Generation objects.

Method | Return Type | Description
from_dict(response) | Generations | Class method that parses a raw API response dictionary (with a "generations" key) into a Generations instance.
__iter__() | Iterator | Returns an iterator over the contained Generation objects.
__next__() | Generation | Returns the next Generation from the internal iterator.
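To make the from_dict contract concrete, the following is a rough sketch of how a response dictionary with a "generations" key might be turned into structured objects. It deliberately uses hypothetical stand-in dataclasses (TokenScore, ParsedGeneration) and a hypothetical parse_generations function rather than the SDK classes, so it runs without the cohere package; how the SDK handles a missing "token_likelihoods" key is assumed here from the note that the field may be None.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class TokenScore:  # hypothetical stand-in for TokenLikelihood
    token: str
    likelihood: Optional[float]


@dataclass
class ParsedGeneration:  # hypothetical stand-in for Generation
    text: str
    token_likelihoods: Optional[List[TokenScore]]


def parse_generations(response: Dict[str, Any]) -> List[ParsedGeneration]:
    """Convert a raw response dict into a list of stand-in generation objects."""
    parsed = []
    for gen in response["generations"]:
        raw_scores = gen.get("token_likelihoods")
        scores = None  # assumption: stays None when likelihoods are absent
        if raw_scores is not None:
            scores = [TokenScore(s["token"], s.get("likelihood")) for s in raw_scores]
        parsed.append(ParsedGeneration(gen["text"], scores))
    return parsed


sample = {
    "generations": [
        {"text": "Paris.", "token_likelihoods": [{"token": "Paris", "likelihood": -0.2}]},
        {"text": "Lyon."},  # no likelihoods requested
    ]
}
parsed = parse_generations(sample)
```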

StreamingText (NamedTuple)

Field | Type | Description
index | Optional[int] | The index of the generation stream this text belongs to.
text | str | The text fragment received from the stream.
is_finished | bool | Whether this is the final text chunk in the stream.
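Because each fragment carries the index of the generation it belongs to, a consumer that requested several generations may want to stitch fragments back together per index. A minimal sketch follows; it defines a local NamedTuple with the same three fields listed above instead of importing the SDK type, and join_by_index is a hypothetical helper, not part of the library.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional


class StreamingText(NamedTuple):  # local copy of the documented fields
    index: Optional[int]
    text: str
    is_finished: bool


def join_by_index(chunks: Iterable[StreamingText]) -> Dict[Optional[int], str]:
    """Concatenate text fragments separately for each generation index."""
    parts: Dict[Optional[int], List[str]] = defaultdict(list)
    for chunk in chunks:
        parts[chunk.index].append(chunk.text)
    return {index: "".join(texts) for index, texts in parts.items()}


# Hypothetical interleaved fragments from two generations.
fragments = [
    StreamingText(index=0, text="Hello", is_finished=False),
    StreamingText(index=1, text="Bon", is_finished=False),
    StreamingText(index=0, text=" world", is_finished=False),
    StreamingText(index=1, text="jour", is_finished=False),
]
joined = join_by_index(fragments)
```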

StreamingGenerations

Parameter | Type | Description
stream | (iterable) | The raw streaming response from the AWS endpoint (SageMaker or Bedrock).
mode | Mode | The deployment mode, either Mode.SAGEMAKER or Mode.BEDROCK. Determines the payload and byte keys used to parse stream chunks.

Attribute | Type | Description
id | str or None | The response ID, populated once the stream completes.
generations | Generations or None | The full Generations object, populated from the final stream item.
finish_reason | str or None | The reason the generation stream ended (e.g., "COMPLETE").
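The mode-dependent "payload and byte keys" can be pictured as a small lookup table. The sketch below is an illustration under stated assumptions: the Mode enum is a local stand-in for the SDK's, STREAM_KEYS and extract_bytes are hypothetical names, and the key names are taken from the boto3 event shapes for SageMaker's invoke_endpoint_with_response_stream ("PayloadPart" / "Bytes") and Bedrock's invoke_model_with_response_stream ("chunk" / "bytes"), which may not match the SDK's internals exactly.

```python
from enum import Enum
from typing import Any, Dict, Tuple


class Mode(Enum):  # stand-in for the SDK's Mode enum
    SAGEMAKER = 1
    BEDROCK = 2


# Assumed (payload key, bytes key) pair for each deployment mode.
STREAM_KEYS: Dict[Mode, Tuple[str, str]] = {
    Mode.SAGEMAKER: ("PayloadPart", "Bytes"),
    Mode.BEDROCK: ("chunk", "bytes"),
}


def extract_bytes(event: Dict[str, Any], mode: Mode) -> bytes:
    """Pull the raw byte payload out of one stream event for the given mode."""
    payload_key, bytes_key = STREAM_KEYS[mode]
    return event[payload_key][bytes_key]


# Hypothetical events shaped like each service's stream output.
sagemaker_event = {"PayloadPart": {"Bytes": b'{"text": "Hi"}'}}
bedrock_event = {"chunk": {"bytes": b'{"text": "Hi"}'}}
```

Keeping the keys in one table means the rest of the stream handling can stay identical for both services; only the envelope around the bytes differs.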

Usage Examples

from cohere.manually_maintained.cohere_aws.generation import (
    Generations,
    StreamingGenerations,
)
from cohere.manually_maintained.cohere_aws.mode import Mode

# Parse a synchronous generation response
response_dict = {
    "generations": [
        {
            "text": "The capital of France is Paris.",
            "token_likelihoods": [
                {"token": "The", "likelihood": -0.5},
                {"token": " capital", "likelihood": -0.3},
            ]
        }
    ]
}
generations = Generations.from_dict(response_dict)
for gen in generations:
    print(gen.text)  # "The capital of France is Paris."
    # token_likelihoods may be None when likelihoods were not requested
    for tl in gen.token_likelihoods or []:
        print(f"  {tl.token}: {tl.likelihood}")

# Streaming usage (conceptual example with a SageMaker endpoint)
# stream = sagemaker_client.invoke_endpoint_with_response_stream(...)["Body"]
# streaming_gens = StreamingGenerations(stream, mode=Mode.SAGEMAKER)
# for text_chunk in streaming_gens:
#     print(text_chunk.text, end="")
# print()
# print(f"Finish reason: {streaming_gens.finish_reason}")
