# Implementation: Anthropic Python SDK BetaUsage
| Knowledge Sources | |
|---|---|
| Domains | API Types, Beta |
| Last Updated | 2026-02-15 12:00 GMT |
## Overview
BetaUsage is a Pydantic model that tracks token consumption and usage metadata for beta API requests. It provides a detailed breakdown of input tokens, output tokens, cache statistics, server tool usage, and inference metadata associated with a single API call.
## Description
The BetaUsage class extends BaseModel and represents the usage section of a beta API response. It contains required fields for input_tokens and output_tokens (both integers), along with several optional fields that provide granular insight into how tokens were consumed. These include cache creation and read counts, geographic inference region, per-iteration breakdowns for agentic loops, server tool use counts, service tier classification, and inference speed mode.
Key optional sub-models include:
- BetaCacheCreation -- Breakdown of cached tokens by TTL.
- BetaIterationsUsage -- Per-iteration token usage breakdown for multi-turn server-side tool use loops.
- BetaServerToolUsage -- Count of server tool requests made during the call.
## Usage
Use BetaUsage when you need to inspect token consumption details from a beta Messages API response. It is typically accessed via the usage attribute on a BetaMessage response object. This is particularly useful for:
- Monitoring costs and token budgets.
- Determining whether prompt caching was utilized (via `cache_creation_input_tokens` and `cache_read_input_tokens`).
- Analyzing multi-iteration agentic workflows (via the `iterations` field).
- Understanding which service tier and speed mode were applied to a request.
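For cache monitoring, the optional counters can be combined into a rough cache-hit ratio. The sketch below is illustrative, not part of the SDK: the helper name is ours, and it assumes `input_tokens` counts only uncached prompt tokens, so the full prompt size is the sum of all three counters.

```python
from typing import Optional


def cache_hit_ratio(input_tokens: int,
                    cache_read_input_tokens: Optional[int] = None,
                    cache_creation_input_tokens: Optional[int] = None) -> float:
    """Fraction of prompt tokens served from the cache.

    Assumption (ours): input_tokens excludes cached tokens, so the
    total prompt size is input + cache_read + cache_creation.
    """
    read = cache_read_input_tokens or 0
    created = cache_creation_input_tokens or 0
    total = input_tokens + read + created
    # Guard against division by zero when every counter is 0 or None.
    return read / total if total else 0.0
```

Because the cache fields are `Optional[int]`, the helper treats `None` the same as zero, which keeps call sites free of null checks.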
## Code Reference
### Source Location
- Repository: Anthropic SDK Python
- File: `src/anthropic/types/beta/beta_usage.py`
### Signature
```python
class BetaUsage(BaseModel):
    cache_creation: Optional[BetaCacheCreation] = None
    cache_creation_input_tokens: Optional[int] = None
    cache_read_input_tokens: Optional[int] = None
    inference_geo: Optional[str] = None
    input_tokens: int
    iterations: Optional[BetaIterationsUsage] = None
    output_tokens: int
    server_tool_use: Optional[BetaServerToolUsage] = None
    service_tier: Optional[Literal["standard", "priority", "batch"]] = None
    speed: Optional[Literal["standard", "fast"]] = None
```
### Import
```python
from anthropic.types.beta import BetaUsage
```
## I/O Contract
### Fields
| Field | Type | Required | Description |
|---|---|---|---|
| `input_tokens` | `int` | Yes | The number of input tokens used. |
| `output_tokens` | `int` | Yes | The number of output tokens used. |
| `cache_creation` | `Optional[BetaCacheCreation]` | No | Breakdown of cached tokens by TTL. |
| `cache_creation_input_tokens` | `Optional[int]` | No | Number of input tokens used to create the cache entry. |
| `cache_read_input_tokens` | `Optional[int]` | No | Number of input tokens read from the cache. |
| `inference_geo` | `Optional[str]` | No | Geographic region where inference was performed. |
| `iterations` | `Optional[BetaIterationsUsage]` | No | Per-iteration token usage breakdown for server-side tool use loops. |
| `server_tool_use` | `Optional[BetaServerToolUsage]` | No | Count of server tool requests. |
| `service_tier` | `Optional[Literal["standard", "priority", "batch"]]` | No | The service tier used for the request. |
| `speed` | `Optional[Literal["standard", "fast"]]` | No | The inference speed mode used. |
## Usage Examples
```python
import anthropic

client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude"}],
    betas=["max-tokens-3-5-sonnet-2024-07-15"],
)

usage = response.usage
print(f"Input tokens: {usage.input_tokens}")
print(f"Output tokens: {usage.output_tokens}")

if usage.cache_read_input_tokens:
    print(f"Tokens read from cache: {usage.cache_read_input_tokens}")

if usage.service_tier:
    print(f"Service tier: {usage.service_tier}")
```
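The token counts read above are commonly fed into cost tracking. The helper below is a hedged sketch: the function name is ours, and the per-million-token prices are caller-supplied placeholders, not current Anthropic pricing.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price_per_mtok: float,
                      output_price_per_mtok: float) -> float:
    """Estimate request cost in USD from usage counters.

    Prices are per million tokens and must be supplied by the caller;
    look up current rates rather than hard-coding them.
    """
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000
```

A fuller accounting would also weight `cache_creation_input_tokens` and `cache_read_input_tokens`, which are typically billed at different rates from regular input tokens.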
## Related Pages
- ParsedBetaMessage -- Beta message response that includes a `usage` field of this type.
- Beta MessageCountTokensParams -- Parameters for counting tokens in the beta API, related to usage estimation.