Overview
AnthropicModel is a legacy model wrapper in the phoenix-evals package that provides an interface for using Anthropic Claude models within Phoenix LLM evaluations. It extends BaseModel and integrates with the Anthropic SDK, providing both synchronous and asynchronous generation, dynamic rate limiting that adjusts based on API throttling responses, and structured token usage tracking. The wrapper formats prompts into the Anthropic Messages API format and handles context limit errors gracefully.
LLM_Evaluation
Model_Integration
Description
The AnthropicModel class is implemented as a Python dataclass that extends the abstract BaseModel. Key characteristics include:
- Anthropic SDK integration: Initializes both synchronous (
Anthropic()) and asynchronous (AsyncAnthropic()) clients from the anthropic package.
- Version validation: Enforces a minimum Anthropic SDK version of
0.18.0 at initialization time.
- Dynamic rate limiting: Uses the
RateLimiter infrastructure, configured to intercept anthropic.RateLimitError with an initial rate of 5 requests per second and a 1-minute enforcement window.
- Context limit handling: Catches
BadRequestError exceptions containing "prompt is too long" messages and re-raises them as PhoenixContextLimitExceeded.
- Full async support: Both
_generate_with_extra() and _async_generate_with_extra() are natively implemented.
- Tool use support: The
_extract_text() method detects tool use blocks in responses and serializes them as JSON.
- Cache-aware usage tracking: Token usage extraction accounts for
cache_creation_input_tokens and cache_read_input_tokens in addition to standard input_tokens.
Usage
# Set the ANTHROPIC_API_KEY environment variable before use
from phoenix.evals.models import AnthropicModel
# Basic usage with defaults (claude-2.1)
model = AnthropicModel()
# Specify a different model and parameters
model = AnthropicModel(
model="claude-3-5-sonnet-20241022",
temperature=0.5,
max_tokens=2048,
top_p=0.9,
initial_rate_limit=10,
)
# Direct invocation
response = model("Explain quantum computing in simple terms.")
print(response)
Code Reference
Source Location
| Property |
Value
|
| Repository |
Arize-ai/phoenix
|
| File |
packages/phoenix-evals/src/phoenix/evals/legacy/models/anthropic.py
|
| Lines |
226
|
| Module |
phoenix.evals.legacy.models.anthropic
|
Class Signature
@dataclass
class AnthropicModel(BaseModel):
model: str = "claude-2.1"
temperature: float = 0.0
max_tokens: int = 1024
top_p: float = 1
top_k: int = 256
stop_sequences: List[str] = field(default_factory=list)
extra_parameters: Dict[str, Any] = field(default_factory=dict)
max_content_size: Optional[int] = None
initial_rate_limit: int = 5
timeout: int = 120
Constructor Parameters
| Parameter |
Type |
Default |
Description
|
| model |
str |
"claude-2.1" |
The Anthropic model name to use.
|
| temperature |
float |
0.0 |
Sampling temperature for generation.
|
| max_tokens |
int |
1024 |
Maximum number of tokens to generate.
|
| top_p |
float |
1 |
Nucleus sampling probability mass.
|
| top_k |
int |
256 |
Top-K sampling cutoff.
|
| stop_sequences |
List[str] |
[] |
Sequences that halt generation.
|
| extra_parameters |
Dict[str, Any] |
{} |
Extra parameters for the request body (e.g., countPenalty for A21 models).
|
| max_content_size |
Optional[int] |
None |
Maximum content size for fine-tuned models.
|
| initial_rate_limit |
int |
5 |
Initial requests-per-second rate limit.
|
| timeout |
int |
120 |
Timeout for API requests in seconds.
|
Key Methods
| Method |
Signature |
Description
|
| __post_init__ |
(self) -> None |
Initializes the Anthropic client and rate limiter.
|
| _init_client |
(self) -> None |
Validates SDK version and creates sync/async Anthropic clients.
|
| _init_rate_limiter |
(self) -> None |
Configures the rate limiter with RateLimitError.
|
| invocation_parameters |
(self) -> Dict[str, Any] |
Returns the API call parameters dictionary.
|
| _generate_with_extra |
(self, prompt, **kwargs) -> Tuple[str, ExtraInfo] |
Synchronous generation with rate limiting.
|
| _async_generate_with_extra |
async (self, prompt, **kwargs) -> Tuple[str, ExtraInfo] |
Asynchronous generation with rate limiting.
|
| _format_prompt_for_claude |
(self, prompt: MultimodalPrompt) -> List[Dict[str, str]] |
Converts a MultimodalPrompt to Anthropic Messages API format.
|
| _extract_text |
(self, message: Message) -> str |
Extracts text or tool-use JSON from an Anthropic response.
|
| _extract_usage |
(self, message_usage: MessageUsage) -> Usage |
Extracts token usage including cache tokens.
|
| _parse_output |
(self, message: Message) -> Tuple[str, ExtraInfo] |
Combines text extraction and usage extraction.
|
Import
from phoenix.evals.models import AnthropicModel
I/O Contract
| Direction |
Type |
Description
|
| Input |
Union[str, MultimodalPrompt] |
A text string or multimodal prompt (only text parts supported for Anthropic).
|
| Input (optional) |
Optional[str] |
Instruction parameter (ignored; stripped before API call).
|
| Output |
str |
Generated text response, or JSON-serialized tool use arguments.
|
| Output (with extra) |
Tuple[str, ExtraInfo] |
Generated text paired with ExtraInfo containing Usage token counts.
|
| Error |
PhoenixContextLimitExceeded |
Raised when prompt exceeds the model's context window.
|
| Error |
ImportError |
Raised if anthropic package is not installed or version is too old.
|
Usage Examples
Basic Generation
from phoenix.evals.models import AnthropicModel
model = AnthropicModel(model="claude-3-5-sonnet-20241022", temperature=0.0)
response = model("What are the benefits of test-driven development?")
print(response)
Async Generation
import asyncio
from phoenix.evals.models import AnthropicModel
model = AnthropicModel(model="claude-3-5-sonnet-20241022")
async def generate():
result = await model._async_generate("Summarize the theory of relativity.")
return result
response = asyncio.run(generate())
print(response)
With Custom Rate Limiting
from phoenix.evals.models import AnthropicModel
# Higher rate limit for batch processing
model = AnthropicModel(
model="claude-3-haiku-20240307",
initial_rate_limit=20,
max_tokens=512,
)
Related Pages
Page Connections
Double-click a node to navigate. Hold to expand connections.