Environment: OpenCompass VLMEvalKit API Keys and Credentials
| Knowledge Sources | Details |
|---|---|
| Domains | Infrastructure, API_Integration, Security |
| Last Updated | 2026-02-14 01:30 GMT |
Overview
API keys and credentials environment for VLMEvalKit, loaded from a `.env` file at the repository root via `python-dotenv`.
Description
VLMEvalKit supports evaluation of numerous API-based VLMs (GPT-4o, Claude, Gemini, Qwen, etc.) and uses LLM judges for open-ended evaluation. All API credentials are loaded from a `.env` file located at the repository root by the `load_env()` function in `vlmeval/smp/misc.py`. This function is called automatically when the `vlmeval` package is imported (`vlmeval/__init__.py:11`). The keys are injected into `os.environ` and used by API wrappers throughout the codebase.
Usage
Use this environment for any workflow that involves API model inference (API Model Evaluation workflow) or LLM-as-judge evaluation (any benchmark requiring GPT-4o, GPT-4-turbo, or other judge models). Without proper API keys, API-based evaluations will fail with authentication errors.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| File | `.env` file at repo root | Must be alongside the `vlmeval/` directory |
| Network | Internet access | Required for API calls to OpenAI, Google, Anthropic, etc. |
| Proxy | Optional HTTP proxy | Configurable via `EVAL_PROXY` environment variable |
Dependencies
Python Packages
- `python-dotenv` (for `.env` file parsing)
- `openai` (for OpenAI API models)
- `google-genai` (for Gemini API models)
- `requests` (for generic HTTP API calls)
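The `.env` loading behavior can be approximated with a stdlib-only sketch. `load_env_simplified` is a hypothetical helper, not part of VLMEvalKit; the real parser is python-dotenv's `dotenv_values`, which the actual `load_env()` uses.

```python
import os
import tempfile

def load_env_simplified(path):
    """Simplified, stdlib-only sketch of what load_env() does via
    dotenv_values: parse KEY=VALUE lines and copy non-empty values
    into os.environ (comments and blank lines are skipped)."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            value = value.strip().strip("'\"")
            if value:  # empty values are skipped, matching load_env()
                os.environ[key.strip()] = value

# Demo with a throwaway .env file
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("OPENAI_API_KEY=sk-demo\n# a comment\nEMPTY=\n")
    env_path = fh.name
load_env_simplified(env_path)
print(os.environ.get("OPENAI_API_KEY"))  # sk-demo
```

Note that, like the real loader, empty values never reach `os.environ`, so a placeholder line such as `EMPTY=` in the `.env` file has no effect.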
Credentials
The following environment variables should be set in the `.env` file. Never commit actual key values to version control.
API Model Keys
- `OPENAI_API_KEY`: OpenAI API key (must start with `sk-`). Checked by `gpt_key_set()` in `vlmeval/smp/vlm.py:189`.
- `AZURE_OPENAI_API_KEY`: Azure OpenAI API key (fallback if `OPENAI_API_KEY` not set). Checked in `vlmeval/smp/vlm.py:191`.
- `O1_API_KEY`: Separate API key for OpenAI o1/o3 models (`vlmeval/config.py:84`).
- `O1_API_BASE`: API base URL for o1/o3 models (`vlmeval/config.py:85`).
Platform and Proxy Keys
- `EVAL_PROXY`: HTTP proxy URL for evaluation API calls (`run.py:459`). Temporarily overrides `HTTP_PROXY` during evaluation.
- `HTTP_PROXY` / `HTTPS_PROXY`: Standard HTTP proxy variables.
- `FWD_API`: When set to `'1'`, forces all API models to use the `GPT4V` class (`run.py:234`).
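The temporary proxy override can be sketched as follows. This is a hedged illustration of the pattern only: the real logic in `run.py` is inline code, not a context manager, and `eval_proxy` is a hypothetical name.

```python
import os
from contextlib import contextmanager

@contextmanager
def eval_proxy(proxy_url):
    """Illustrative sketch of the EVAL_PROXY pattern: temporarily point
    HTTP_PROXY and HTTPS_PROXY at the evaluation proxy, then restore
    whatever values were there before."""
    saved = {k: os.environ.get(k) for k in ("HTTP_PROXY", "HTTPS_PROXY")}
    os.environ["HTTP_PROXY"] = os.environ["HTTPS_PROXY"] = proxy_url
    try:
        yield
    finally:
        for key, old in saved.items():
            if old is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old

os.environ["HTTP_PROXY"] = "http://default:3128"
with eval_proxy("http://proxy:8080"):
    during = os.environ["HTTP_PROXY"]
after = os.environ["HTTP_PROXY"]
print(during, after)  # http://proxy:8080 http://default:3128
```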
Data and Output Configuration
- `VLMEVALKIT_USE_MODELSCOPE`: Set to `'1'` or `'True'` to use ModelScope instead of HuggingFace for model/dataset downloads (`vlmeval/smp/misc.py:30`).
- `LMUData`: Custom path for dataset storage root. Defaults to `~/LMUData` (`vlmeval/smp/file.py:70`).
- `HUGGINGFACE_HUB_CACHE` / `HF_HOME`: HuggingFace model cache directory (`vlmeval/smp/file.py:79`).
- `MMEVAL_ROOT`: Override for the output work directory (`run.py:221-222`).
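The dataset-root precedence above (explicit `LMUData` wins, `~/LMUData` otherwise) can be sketched with a minimal helper; `dataset_root` is an illustrative name, not the function used in `vlmeval/smp/file.py`.

```python
import os
import os.path as osp

def dataset_root():
    """Return the dataset storage root: the LMUData environment variable
    if set, otherwise the documented ~/LMUData default."""
    return os.environ.get("LMUData", osp.expanduser("~/LMUData"))

os.environ.pop("LMUData", None)
default_root = dataset_root()       # falls back to ~/LMUData
os.environ["LMUData"] = "/data/lmu"
custom_root = dataset_root()        # explicit setting wins
print(custom_root)  # /data/lmu
```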
Inference Configuration
- `PRED_FORMAT`: Output prediction format. One of `tsv`, `xlsx`, `json`. Default: `xlsx` (`vlmeval/smp/file.py:174`).
- `EVAL_FORMAT`: Output evaluation format. One of `csv`, `json`. Default: `csv` (`vlmeval/smp/file.py:183`).
- `VLMEVAL_MAX_IMAGE_SIZE`: Maximum image size in bytes for base64 encoding. Default: 1e9 (`vlmeval/smp/vlm.py:110`).
- `VLMEVAL_MIN_IMAGE_EDGE`: Minimum image edge length in pixels. Default: 100 (`vlmeval/smp/vlm.py:111`).
- `SKIP_ERR`: Set to `'1'` to gracefully handle runtime errors during inference instead of crashing (`vlmeval/inference.py:158`).
- `SPLIT_THINK`: When set, splits model responses into thinking and prediction parts for chain-of-thought models (`vlmeval/inference.py:225`).
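The format variables above are validated against a closed set of values. A minimal sketch of that pattern, with a hypothetical `get_format` helper (the real checks live in `vlmeval/smp/file.py`):

```python
import os

PRED_FORMATS = ("tsv", "xlsx", "json")
EVAL_FORMATS = ("csv", "json")

def get_format(var, allowed, default):
    """Read a format env var, falling back to the default, and reject
    anything outside the allowed set (mirroring the documented errors)."""
    value = os.environ.get(var, default)
    if value not in allowed:
        raise ValueError(f"Unsupported {var} {value}")
    return value

os.environ["PRED_FORMAT"] = "json"
os.environ.pop("EVAL_FORMAT", None)
pred_fmt = get_format("PRED_FORMAT", PRED_FORMATS, "xlsx")
eval_fmt = get_format("EVAL_FORMAT", EVAL_FORMATS, "csv")
print(pred_fmt, eval_fmt)  # json csv
```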
Model-Specific
- `EVAL_MODEL`: Override model path for Aguvis configuration. Default: `xlangai/Aguvis-7B-720P` (`vlmeval/config.py:1848-1850`).
- `ENV_433`, `ENV_437`, `ENV_440`, `ENV_latest`: Python environment paths for different transformers versions, used by `vlmutil run` (`vlmeval/tools.py:360-363`).
Quick Install
```bash
# Create .env file at repository root
cat > .env << 'EOF'
OPENAI_API_KEY=sk-your-key-here
# Optional: Azure fallback
# AZURE_OPENAI_API_KEY=your-azure-key
# Optional: Proxy for evaluation
# EVAL_PROXY=http://proxy:8080
EOF
```
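After writing the file, a quick sanity check (illustrative only, not part of VLMEvalKit) is to confirm the OpenAI key line exists and carries the expected `sk-` prefix:

```shell
# Self-contained demo: write a sample .env, then verify the key prefix
cat > .env << 'EOF'
OPENAI_API_KEY=sk-your-key-here
EOF
if grep -q '^OPENAI_API_KEY=sk-' .env; then
  echo "OPENAI_API_KEY looks valid"
else
  echo "missing or malformed key"
fi
```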
Code Evidence
`.env` file loading from `vlmeval/smp/misc.py:200-223`:
```python
def load_env():
    import logging
    logging.basicConfig(...)
    try:
        import vlmeval
    except ImportError:
        logging.error('VLMEval is not installed. Failed to import environment variables from .env file. ')
        return
    pth = osp.realpath(vlmeval.__path__[0])
    pth = osp.join(pth, '../.env')
    pth = osp.realpath(pth)
    if not osp.exists(pth):
        logging.error(f'Did not detect the .env file at {pth}, failed to load. ')
        return
    from dotenv import dotenv_values
    values = dotenv_values(pth)
    for k, v in values.items():
        if v is not None and len(v):
            os.environ[k] = v
    logging.info(f'API Keys successfully loaded from {pth}')
```
OpenAI key validation from `vlmeval/smp/vlm.py:188-193`:
```python
def gpt_key_set():
    openai_key = os.environ.get('OPENAI_API_KEY', None)
    if openai_key is None:
        openai_key = os.environ.get('AZURE_OPENAI_API_KEY', None)
        return isinstance(openai_key, str)
    return isinstance(openai_key, str) and openai_key.startswith('sk-')
```
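This validation can be exercised outside the repo with a self-contained replica (only `os` is needed); note the Azure fallback accepts any string, while OpenAI keys must carry the `sk-` prefix:

```python
import os

# Self-contained replica of gpt_key_set(), runnable for illustration
def gpt_key_set():
    openai_key = os.environ.get('OPENAI_API_KEY', None)
    if openai_key is None:
        openai_key = os.environ.get('AZURE_OPENAI_API_KEY', None)
        return isinstance(openai_key, str)
    return isinstance(openai_key, str) and openai_key.startswith('sk-')

os.environ.pop('AZURE_OPENAI_API_KEY', None)
os.environ['OPENAI_API_KEY'] = 'not-a-valid-key'
bad = gpt_key_set()   # False: OpenAI keys must start with 'sk-'
os.environ['OPENAI_API_KEY'] = 'sk-demo'
good = gpt_key_set()  # True
print(bad, good)  # False True
```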
FWD_API forwarding from `run.py:234-241`:
```python
if os.environ.get('FWD_API', None) == '1':
    from vlmeval.config import api_models as supported_APIs
    from vlmeval.api import GPT4V
    for m in args.model:
        if m in supported_APIs:
            kws = supported_VLM[m].keywords
            supported_VLM[m] = partial(GPT4V, **kws)
            logger.warning(f'FWD_API is set, will use class `GPT4V` for {m}')
```
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `Did not detect the .env file at {pth}, failed to load.` | Missing `.env` file | Create `.env` at repo root with required keys |
| `Failed to obtain answer via API.` | API key invalid or missing | Verify `OPENAI_API_KEY` in `.env` starts with `sk-` |
| `Unsupported PRED_FORMAT xxx` | Invalid `PRED_FORMAT` env var | Use one of: `tsv`, `xlsx`, `json` |
| `Unsupported EVAL_FORMAT xxx` | Invalid `EVAL_FORMAT` env var | Use one of: `csv`, `json` |
Compatibility Notes
- ModelScope fallback: Setting `VLMEVALKIT_USE_MODELSCOPE=1` switches dataset/model downloads from HuggingFace to ModelScope, useful in regions where HuggingFace is blocked.
- Proxy handling: The `EVAL_PROXY` variable is temporarily set during evaluation and restored afterward (`run.py:459-480`), ensuring it does not affect inference API calls.
- FWD_API mode: When `FWD_API=1`, all API models are routed through a single `GPT4V` wrapper class, useful for API gateway/proxy setups.
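The rebinding idea behind `FWD_API` can be sketched in isolation. `GPT4VLike`, `NativeWrapper`, and the registry here are illustrative stand-ins for vlmeval's `GPT4V` class and `supported_VLM` dict; the mechanism shown (reading `functools.partial.keywords` and re-wrapping with a different class) matches the excerpt above.

```python
from functools import partial

class GPT4VLike:
    """Stand-in for the single wrapper class all API models route through."""
    def __init__(self, model='gpt-4o', **kwargs):
        self.model = model
        self.kwargs = kwargs

class NativeWrapper:
    """Stand-in for a model's original API wrapper class."""
    def __init__(self, model='claude-3'):
        self.model = model

registry = {'some_api_model': partial(NativeWrapper, model='claude-3')}

# Rebind to the single wrapper class, preserving the original keywords
kws = registry['some_api_model'].keywords
registry['some_api_model'] = partial(GPT4VLike, **kws)

inst = registry['some_api_model']()
print(type(inst).__name__, inst.model)  # GPT4VLike claude-3
```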