
Environment:OpenAI Evals OpenAI API Configuration

From Leeroopedia
Knowledge Sources
Domains Infrastructure, API_Configuration
Last Updated 2026-02-14 10:00 GMT

Overview

OpenAI API key and runtime configuration environment for executing evaluations against OpenAI models.

Description

This environment defines the mandatory `OPENAI_API_KEY` credential and the suite of `EVALS_*` runtime configuration variables that control eval execution behavior. The OpenAI API key is loaded at module import time in `evals/registry.py` and is required for all evaluations that call OpenAI models. The `EVALS_*` variables control threading, timeouts, progress display, and sequential execution mode.

Usage

Use this environment whenever running evaluations against OpenAI models (e.g., gpt-3.5-turbo, gpt-4). The `OPENAI_API_KEY` is the single mandatory credential. The `EVALS_*` variables are optional but recommended for tuning performance and debugging.

System Requirements

| Category | Requirement | Notes |
| --- | --- | --- |
| Network | Internet access | Required for OpenAI API calls |
| API account | OpenAI account with API access | See https://platform.openai.com/account/api-keys |

Dependencies

Python Packages

  • `openai` >= 1.0.0

Credentials

The following environment variables must be set:

  • `OPENAI_API_KEY`: Required. OpenAI API key with read access. Used across `evals/registry.py:26`, `evals/completion_fns/openai.py`, `evals/completion_fns/retrieval.py`, and multiple eval suites.
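Since the key is read when `evals/registry.py` is imported, it is worth verifying before launching a run. A minimal preflight sketch (the `require_api_key` helper is illustrative, not part of evals):

```python
import os
import sys

def require_api_key() -> str:
    """Return OPENAI_API_KEY, exiting with a clear message if it is unset or blank."""
    key = os.environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        sys.exit("OPENAI_API_KEY is not set; export it before running evals.")
    return key
```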

Runtime Configuration Variables

  • `EVALS_THREADS`: Number of parallel threads for eval execution. Default: `10`. Used in `evals/eval.py:124`.
  • `EVALS_THREAD_TIMEOUT`: Timeout in seconds per thread before restart. Default: `40`. Used in `evals/utils/api_utils.py:6`.
  • `EVALS_SEQUENTIAL`: Set to `1`, `true`, or `yes` to run evals sequentially instead of in parallel. Default: `0`. Used in `evals/eval.py:140`.
  • `EVALS_SHOW_EVAL_PROGRESS`: Set to any non-empty value to show a progress bar during eval execution. Used in `evals/eval.py:125`.
  • `EVALS_GENTLE_INTERRUPT`: Enable gentle interrupt handling. Used in `evals/eval.py:242`.
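Taken together, these variables can be parsed into a plain dict that mirrors the documented defaults. The sketch below is illustrative, not the framework's own config loader:

```python
import os

def read_evals_config() -> dict:
    """Parse the EVALS_* runtime variables using the defaults documented above."""
    truthy = {"1", "true", "yes"}
    return {
        "threads": int(os.environ.get("EVALS_THREADS", "10")),
        "thread_timeout": float(os.environ.get("EVALS_THREAD_TIMEOUT", "40")),
        "sequential": os.environ.get("EVALS_SEQUENTIAL", "0") in truthy,
        # For the flags below, any non-empty value (even "0") counts as enabled,
        # matching the bool(os.environ.get(...)) pattern in the code evidence.
        "show_progress": bool(os.environ.get("EVALS_SHOW_EVAL_PROGRESS", "")),
        "gentle_interrupt": bool(os.environ.get("EVALS_GENTLE_INTERRUPT", "")),
    }
```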

Optional Snowflake Logging Credentials

  • `SNOWFLAKE_ACCOUNT`: Snowflake account identifier.
  • `SNOWFLAKE_DATABASE`: Snowflake database name.
  • `SNOWFLAKE_USERNAME`: Snowflake login username.
  • `SNOWFLAKE_PASSWORD`: Snowflake login password.

Quick Install

# Set required API key
export OPENAI_API_KEY="sk-..."

# Optional: configure threading for faster execution
export EVALS_THREADS=20
export EVALS_THREAD_TIMEOUT=120

# Optional: run sequentially for debugging
export EVALS_SEQUENTIAL=1

# Optional: Snowflake logging
export SNOWFLAKE_ACCOUNT="your-account"
export SNOWFLAKE_DATABASE="your-db"
export SNOWFLAKE_USERNAME="your-user"
export SNOWFLAKE_PASSWORD="your-password"

Code Evidence

OpenAI client initialization from `evals/registry.py:26`:

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

Thread configuration from `evals/eval.py:124-125`:

threads = int(os.environ.get("EVALS_THREADS", "10"))
show_progress = bool(os.environ.get("EVALS_SHOW_EVAL_PROGRESS", show_progress))

Sequential mode check from `evals/eval.py:140-144`:

if os.environ.get("EVALS_SEQUENTIAL", "0") in {"1", "true", "yes"}:
    logger.info("Running in sequential mode!")
    iter = map(eval_sample, work_items)
else:
    logger.info(f"Running in threaded mode with {threads} threads!")
    iter = pool.imap_unordered(eval_sample, work_items)

Thread timeout from `evals/utils/api_utils.py:6`:

EVALS_THREAD_TIMEOUT = float(os.environ.get("EVALS_THREAD_TIMEOUT", "40"))

Common Errors

| Error Message | Cause | Solution |
| --- | --- | --- |
| `openai.AuthenticationError: Incorrect API key` | Invalid or missing `OPENAI_API_KEY` | Verify your API key at https://platform.openai.com/account/api-keys |
| `openai.RateLimitError` | Too many requests | Reduce `EVALS_THREADS` or wait for the rate limit to reset |
| `openai.APITimeoutError` | Request exceeded timeout | Increase `EVALS_THREAD_TIMEOUT` (e.g., to 120 or 600) |
| `ValueError: human_cli player is available only with EVALS_SEQUENTIAL=1` | Human CLI solver requires sequential mode | Set `EVALS_SEQUENTIAL=1` before running |
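For `openai.RateLimitError` in particular, client-side retry with exponential backoff complements lowering `EVALS_THREADS`. A generic sketch (`with_backoff` is an illustrative helper, not part of evals; pass `retry_on=(openai.RateLimitError,)` in practice):

```python
import random
import time

def with_backoff(fn, *, retries=5, base_delay=1.0, retry_on=(Exception,)):
    """Call fn(), retrying on retry_on exceptions; re-raise after `retries` attempts."""
    for attempt in range(retries):
        try:
            return fn()
        except retry_on:
            if attempt == retries - 1:
                raise
            # Exponential backoff with jitter proportional to base_delay.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```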

Compatibility Notes

  • Rate limits: Running with more threads increases throughput but may trigger OpenAI rate limits. Monitor your usage tier and adjust `EVALS_THREADS` accordingly.
  • Long prompts: For evals with long prompts or responses, increase `EVALS_THREAD_TIMEOUT` beyond the default 40 seconds.
  • Human-in-the-loop: Evals using `HumanCliSolver` (e.g., bluff eval) require `EVALS_SEQUENTIAL=1` since interactive CLI input cannot be parallelized.
  • Gemini solver: The Google Gemini solver has a known threading issue and tests force `EVALS_SEQUENTIAL=1` as a workaround (`evals/solvers/providers/google/gemini_solver_test.py:22`).
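When an eval is launched from a Python wrapper rather than a shell, sequential mode can be forced before the framework reads the variable. A minimal sketch:

```python
import os

# Force sequential execution (e.g., for HumanCliSolver-based evals) before
# evals reads EVALS_SEQUENTIAL; setdefault preserves any value already exported.
os.environ.setdefault("EVALS_SEQUENTIAL", "1")
```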
