Environment: OpenAI Evals Optional Provider APIs

Knowledge Sources	OpenAI Evals, Anthropic API, Google Gemini API, Together AI API
Domains	Infrastructure, API_Configuration
Last Updated	2026-02-14 10:00 GMT

Overview

API key configuration for optional third-party LLM providers: Anthropic, Google Gemini, and Together AI.

Description

The OpenAI Evals framework supports evaluating models from multiple LLM providers beyond OpenAI. Each provider requires its own API key configured via environment variables. These are optional dependencies: they are only needed when running evaluations against the specific provider's models. The provider solver implementations are found under `evals/solvers/providers/`.

Usage

Use this environment when running evaluations against non-OpenAI models, such as Anthropic Claude, Google Gemini, or Together AI hosted models. Each provider's API key is only required when that specific solver is invoked.

System Requirements

Category	Requirement	Notes
Network	Internet access	Required for provider API calls
API Accounts	Account with desired provider	Anthropic, Google AI Studio, or Together AI

Dependencies

Python Packages

`anthropic` (for Anthropic/Claude models)
`google-generativeai` (for Google Gemini models)
`google-api-core` (transitive dependency of google-generativeai)

Credentials

The following environment variables are needed depending on the provider:

`GEMINI_API_KEY`: Google Gemini API key. Used in `evals/solvers/providers/google/gemini_solver.py:16`.
`TOGETHER_API_KEY`: Together AI API key. Used in `evals/solvers/providers/together/together_solver.py:59`.

Note: The Anthropic solver uses the `anthropic` library which reads `ANTHROPIC_API_KEY` from the environment by default (standard anthropic SDK behavior).
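
Because `os.environ.get` returns `None` when a variable is unset, a missing key tends to surface only when the first API request fails. A minimal preflight sketch, assuming the variable names listed above; the `PROVIDER_KEYS` mapping and the `missing_keys` helper are hypothetical and not part of the evals codebase:

```python
import os

# Hypothetical mapping of provider name -> environment variable.
# The variable names are the ones listed on this page.
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GEMINI_API_KEY",
    "together": "TOGETHER_API_KEY",
}


def missing_keys(providers, environ=None):
    """Return the env var names that are unset or empty for the given providers."""
    env = os.environ if environ is None else environ
    return [PROVIDER_KEYS[p] for p in providers if not env.get(PROVIDER_KEYS[p])]
```

Calling `missing_keys(["anthropic", "google"])` before starting a run returns an empty list when both keys are present, so only the providers actually under evaluation need to be checked.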

Quick Install

# Anthropic (Claude)
export ANTHROPIC_API_KEY="sk-ant-..."

# Google Gemini
export GEMINI_API_KEY="AI..."

# Together AI
export TOGETHER_API_KEY="..."

Code Evidence

Gemini API key loading from `evals/solvers/providers/google/gemini_solver.py:16-17`:

API_KEY = os.environ.get("GEMINI_API_KEY")
genai.configure(api_key=API_KEY)

Together API key loading from `evals/solvers/providers/together/together_solver.py:59`:

return os.environ.get("TOGETHER_API_KEY")

Anthropic SDK imports from `evals/solvers/providers/anthropic/anthropic_solver.py:3-4` (the solver relies on the SDK to read `ANTHROPIC_API_KEY`, so no explicit key lookup appears):

import anthropic
from anthropic import Anthropic

Gemini retry exceptions from `evals/solvers/providers/google/gemini_solver.py:37-41`:

GEMINI_RETRY_EXCEPTIONS = (
    google.api_core.exceptions.RetryError,
    google.api_core.exceptions.TooManyRequests,
    google.api_core.exceptions.ResourceExhausted,
)
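
How the solver consumes this tuple is not shown on this page. One common pattern for a tuple of retryable exceptions is exponential backoff, sketched below under that assumption; `call_with_retry` and its parameters are hypothetical names, and the helper accepts any tuple of exception classes rather than importing `google.api_core` itself:

```python
import time


def call_with_retry(fn, retry_exceptions, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a listed exception, wait and retry with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_exceptions:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
```

Passing `GEMINI_RETRY_EXCEPTIONS` as `retry_exceptions` would retry only rate-limit style failures, while any other exception propagates immediately.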

Common Errors

Error Message	Cause	Solution
`google.api_core.exceptions.ResourceExhausted`	Gemini API rate limit exceeded	Reduce thread count or wait for quota reset
`google.api_core.exceptions.TooManyRequests`	Too many concurrent Gemini requests	Set `EVALS_SEQUENTIAL=1` or reduce `EVALS_THREADS`
`anthropic.AuthenticationError`	Invalid or missing Anthropic API key	Set `ANTHROPIC_API_KEY` environment variable
Gemini threading failures	Thread-safety issue with Gemini client	Set `EVALS_SEQUENTIAL=1` as a workaround
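
The concurrency workarounds in the table can be applied per run instead of being exported globally. A minimal sketch, assuming the eval is launched as a child process; `eval_env` is a hypothetical helper, and only the `EVALS_SEQUENTIAL` and `EVALS_THREADS` variable names come from this page:

```python
import os


def eval_env(sequential=True, threads=None, base=None):
    """Return a copy of the environment with the concurrency overrides applied."""
    env = dict(os.environ if base is None else base)
    if sequential:
        env["EVALS_SEQUENTIAL"] = "1"  # force sequential execution
    if threads is not None:
        env["EVALS_THREADS"] = str(threads)  # cap the worker thread count
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the evaluation command, which leaves the parent shell's environment unchanged.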

Compatibility Notes

Gemini threading: The Google Gemini solver has a known thread-safety issue. The solver pre-creates the generative client (`get_default_generative_client()`) to mitigate this, but tests still force sequential mode (`EVALS_SEQUENTIAL=1`).
Together AI chat models: The Together solver uses an `is_chat_model` check to choose between the chat and completion API formats. A NOTE comment at `together_solver.py:13` acknowledges that this check is brittle.
Anthropic context limits: The Anthropic solver has a TODO noting that context length handling is not yet implemented, pending availability of the Anthropic tokenizer (`anthropic_solver.py:51`).
