
Environment:Openai Evals Optional Provider APIs

From Leeroopedia
Knowledge Sources
Domains Infrastructure, API_Configuration
Last Updated 2026-02-14 10:00 GMT

Overview

API key configuration for optional third-party LLM providers: Anthropic, Google Gemini, and Together AI.

Description

The OpenAI Evals framework supports evaluating models from multiple LLM providers beyond OpenAI. Each provider requires its own API key configured via environment variables. These are optional dependencies: they are only needed when running evaluations against the specific provider's models. The provider solver implementations are found under `evals/solvers/providers/`.

Usage

Use this environment when running evaluations against non-OpenAI models: Anthropic Claude, Google Gemini, or models hosted on Together AI. A provider's API key is needed only when that provider's solver is invoked.

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| Network | Internet access | Required for provider API calls |
| API Accounts | Account with desired provider | Anthropic, Google AI Studio, or Together AI |

Dependencies

Python Packages

  • `anthropic` (for Anthropic/Claude models)
  • `google-generativeai` (for Google Gemini models)
  • `google-api-core` (transitive dependency of google-generativeai)

Credentials

The following environment variables are needed depending on the provider:

  • `GEMINI_API_KEY`: Google Gemini API key. Used in `evals/solvers/providers/google/gemini_solver.py:16`.
  • `TOGETHER_API_KEY`: Together AI API key. Used in `evals/solvers/providers/together/together_solver.py:59`.

Note: The Anthropic solver uses the `anthropic` library which reads `ANTHROPIC_API_KEY` from the environment by default (standard anthropic SDK behavior).
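
A quick preflight check can confirm which keys are present before an evaluation is launched. The sketch below is illustrative only and is not part of the Evals codebase: the helper `missing_provider_keys` and the `PROVIDER_KEYS` mapping are hypothetical names, and only the three environment variable names are taken from this page.

```python
import os

# Environment variable expected for each optional provider (names from this page).
# The dictionary keys are arbitrary labels chosen for this sketch.
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "together": "TOGETHER_API_KEY",
}


def missing_provider_keys(providers, environ=None):
    """Return the env var names that are unset or empty for the given providers."""
    env = os.environ if environ is None else environ
    return [PROVIDER_KEYS[p] for p in providers if not env.get(PROVIDER_KEYS[p])]


if __name__ == "__main__":
    missing = missing_provider_keys(PROVIDER_KEYS)
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

Passing only the providers actually in use (for example `missing_provider_keys(["gemini"])`) keeps the check consistent with the keys being optional.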

Quick Install

# Anthropic (Claude)
export ANTHROPIC_API_KEY="sk-ant-..."

# Google Gemini
export GEMINI_API_KEY="AI..."

# Together AI
export TOGETHER_API_KEY="..."

Code Evidence

Gemini API key loading from `evals/solvers/providers/google/gemini_solver.py:16-17`:

API_KEY = os.environ.get("GEMINI_API_KEY")
genai.configure(api_key=API_KEY)

Together API key loading from `evals/solvers/providers/together/together_solver.py:59`:

return os.environ.get("TOGETHER_API_KEY")

Anthropic SDK imports from `evals/solvers/providers/anthropic/anthropic_solver.py:3-4` (the `Anthropic` client reads `ANTHROPIC_API_KEY` from the environment when no key is passed explicitly):

import anthropic
from anthropic import Anthropic

Gemini retry exceptions from `evals/solvers/providers/google/gemini_solver.py:37-41`:

GEMINI_RETRY_EXCEPTIONS = (
    google.api_core.exceptions.RetryError,
    google.api_core.exceptions.TooManyRequests,
    google.api_core.exceptions.ResourceExhausted,
)
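
A tuple of exception types like the one above is typically handed to a retry wrapper. This page does not show how the solver itself retries, so the following is a generic exponential-backoff sketch rather than the solver's implementation; `call_with_backoff` and all of its parameters are hypothetical names introduced for illustration.

```python
import random
import time


def call_with_backoff(fn, retry_exceptions, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on the given exception types.

    retry_exceptions is a tuple of exception classes. The sleep parameter is
    injectable so the helper can be exercised without real waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_exceptions:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (1x, 2x, 4x, ... base_delay) plus up to 10% jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, 0.1 * delay))
```

With `google-api-core` installed, the tuple shown above could be passed as `retry_exceptions`. Exceptions outside the tuple are not caught and propagate immediately.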

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `google.api_core.exceptions.ResourceExhausted` | Gemini API rate limit exceeded | Reduce thread count or wait for quota reset |
| `google.api_core.exceptions.TooManyRequests` | Too many concurrent Gemini requests | Set `EVALS_SEQUENTIAL=1` or reduce `EVALS_THREADS` |
| `anthropic.AuthenticationError` | Invalid or missing Anthropic API key | Set `ANTHROPIC_API_KEY` environment variable |
| Gemini threading failures | Thread-safety issue with Gemini client | Set `EVALS_SEQUENTIAL=1` as a workaround |

Compatibility Notes

  • Gemini threading: The Google Gemini solver has a known thread-safety issue. The solver pre-creates the generative client (`get_default_generative_client()`) to mitigate this, but tests still force sequential mode (`EVALS_SEQUENTIAL=1`).
  • Together AI chat models: The Together solver uses a brittle `is_chat_model` check (acknowledged in a NOTE comment at `together_solver.py:13`) to determine whether to use chat vs completion API format.
  • Anthropic context limits: The Anthropic solver has a TODO noting that context length handling is not yet implemented, pending availability of the Anthropic tokenizer (`anthropic_solver.py:51`).
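
When an evaluation is launched from a Python script rather than a shell, the sequential-mode workaround for Gemini can be applied by adjusting the child process environment. The helper below is a minimal sketch; `sequential_env` is a hypothetical name, and only the `EVALS_SEQUENTIAL` variable comes from this page.

```python
import os


def sequential_env(base=None):
    """Return a copy of the environment with EVALS_SEQUENTIAL=1 set.

    The original mapping is left unmodified, so other processes launched
    from the same script keep their default threading behavior.
    """
    env = dict(os.environ if base is None else base)
    env["EVALS_SEQUENTIAL"] = "1"
    return env
```

The result can be passed as the `env` argument of `subprocess.run` together with whatever evaluation command is being used.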
