Environment: OpenCompass VLMEvalKit API Keys and Credentials
| Knowledge Sources | Details |
|---|---|
| Domains | Infrastructure, API_Integration, Security |
| Last Updated | 2026-02-14 01:30 GMT |
Overview
API keys and credentials environment for VLMEvalKit, loaded from a `.env` file at the repository root via `python-dotenv`.
Description
VLMEvalKit supports evaluation of numerous API-based VLMs (GPT-4o, Claude, Gemini, Qwen, etc.) and uses LLM judges for open-ended evaluation. All API credentials are loaded from a `.env` file located at the repository root by the `load_env()` function in `vlmeval/smp/misc.py`. This function is called automatically when the `vlmeval` package is imported (`vlmeval/__init__.py:11`). The keys are injected into `os.environ` and used by API wrappers throughout the codebase.
Usage
Use this environment for any workflow that involves API model inference (API Model Evaluation workflow) or LLM-as-judge evaluation (any benchmark requiring GPT-4o, GPT-4-turbo, or other judge models). Without proper API keys, API-based evaluations will fail with authentication errors.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| File | `.env` file at repo root | Must be alongside the `vlmeval/` directory |
| Network | Internet access | Required for API calls to OpenAI, Google, Anthropic, etc. |
| Proxy | Optional HTTP proxy | Configurable via `EVAL_PROXY` environment variable |
Dependencies
Python Packages
- `python-dotenv` (for `.env` file parsing)
- `openai` (for OpenAI API models)
- `google-genai` (for Gemini API models)
- `requests` (for generic HTTP API calls)
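The `.env` loading behavior can be approximated with a stdlib-only sketch. `load_env_simplified` is a hypothetical helper, not part of VLMEvalKit; the real parser is python-dotenv's `dotenv_values`, which the actual `load_env()` uses.

```python
import os
import tempfile

def load_env_simplified(path):
    """Simplified, stdlib-only sketch of what load_env() does via
    dotenv_values: parse KEY=VALUE lines and copy non-empty values
    into os.environ (comments and blank lines are skipped)."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            value = value.strip().strip("'\"")
            if value:  # empty values are skipped, matching load_env()
                os.environ[key.strip()] = value

# Demo with a throwaway .env file
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("OPENAI_API_KEY=sk-demo\n# a comment\nEMPTY=\n")
    env_path = fh.name
load_env_simplified(env_path)
print(os.environ.get("OPENAI_API_KEY"))  # sk-demo
```

Note that, like the real loader, empty values never reach `os.environ`, so a placeholder line such as `EMPTY=` in the `.env` file has no effect.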
Credentials
The following environment variables should be set in the `.env` file. Never commit actual key values to version control.
API Model Keys
- `OPENAI_API_KEY`: OpenAI API key (must start with `sk-`). Checked by `gpt_key_set()` in `vlmeval/smp/vlm.py:189`.
- `AZURE_OPENAI_API_KEY`: Azure OpenAI API key (fallback if `OPENAI_API_KEY` not set). Checked in `vlmeval/smp/vlm.py:191`.
- `O1_API_KEY`: Separate API key for OpenAI o1/o3 models (`vlmeval/config.py:84`).
- `O1_API_BASE`: API base URL for o1/o3 models (`vlmeval/config.py:85`).
Platform and Proxy Keys
- `EVAL_PROXY`: HTTP proxy URL for evaluation API calls (`run.py:459`). Temporarily overrides `HTTP_PROXY` during evaluation.
- `HTTP_PROXY` / `HTTPS_PROXY`: Standard HTTP proxy variables.
- `FWD_API`: When set to `'1'`, forces all API models to use the `GPT4V` class (`run.py:234`).
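The temporary proxy override can be sketched as follows. This is a hedged illustration of the pattern only: the real logic in `run.py` is inline code, not a context manager, and `eval_proxy` is a hypothetical name.

```python
import os
from contextlib import contextmanager

@contextmanager
def eval_proxy(proxy_url):
    """Illustrative sketch of the EVAL_PROXY pattern: temporarily point
    HTTP_PROXY and HTTPS_PROXY at the evaluation proxy, then restore
    whatever values were there before."""
    saved = {k: os.environ.get(k) for k in ("HTTP_PROXY", "HTTPS_PROXY")}
    os.environ["HTTP_PROXY"] = os.environ["HTTPS_PROXY"] = proxy_url
    try:
        yield
    finally:
        for key, old in saved.items():
            if old is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old

os.environ["HTTP_PROXY"] = "http://default:3128"
with eval_proxy("http://proxy:8080"):
    during = os.environ["HTTP_PROXY"]
after = os.environ["HTTP_PROXY"]
print(during, after)  # http://proxy:8080 http://default:3128
```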
Data and Output Configuration
- `VLMEVALKIT_USE_MODELSCOPE`: Set to `'1'` or `'True'` to use ModelScope instead of HuggingFace for model/dataset downloads (`vlmeval/smp/misc.py:30`).
- `LMUData`: Custom path for dataset storage root. Defaults to `~/LMUData` (`vlmeval/smp/file.py:70`).
- `HUGGINGFACE_HUB_CACHE` / `HF_HOME`: HuggingFace model cache directory (`vlmeval/smp/file.py:79`).
- `MMEVAL_ROOT`: Override for the output work directory (`run.py:221-222`).
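The dataset-root precedence above (explicit `LMUData` wins, `~/LMUData` otherwise) can be sketched with a minimal helper; `dataset_root` is an illustrative name, not the function used in `vlmeval/smp/file.py`.

```python
import os
import os.path as osp

def dataset_root():
    """Return the dataset storage root: the LMUData environment variable
    if set, otherwise the documented ~/LMUData default."""
    return os.environ.get("LMUData", osp.expanduser("~/LMUData"))

os.environ.pop("LMUData", None)
default_root = dataset_root()       # falls back to ~/LMUData
os.environ["LMUData"] = "/data/lmu"
custom_root = dataset_root()        # explicit setting wins
print(custom_root)  # /data/lmu
```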
Inference Configuration
- `PRED_FORMAT`: Output prediction format. One of `tsv`, `xlsx`, `json`. Default: `xlsx` (`vlmeval/smp/file.py:174`).
- `EVAL_FORMAT`: Output evaluation format. One of `csv`, `json`. Default: `csv` (`vlmeval/smp/file.py:183`).
- `VLMEVAL_MAX_IMAGE_SIZE`: Maximum image size in bytes for base64 encoding. Default: 1e9 (`vlmeval/smp/vlm.py:110`).
- `VLMEVAL_MIN_IMAGE_EDGE`: Minimum image edge length in pixels. Default: 100 (`vlmeval/smp/vlm.py:111`).
- `SKIP_ERR`: Set to `'1'` to gracefully handle runtime errors during inference instead of crashing (`vlmeval/inference.py:158`).
- `SPLIT_THINK`: When set, splits model responses into thinking and prediction parts for chain-of-thought models (`vlmeval/inference.py:225`).
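The format variables above are validated against a closed set of values. A minimal sketch of that pattern, with a hypothetical `get_format` helper (the real checks live in `vlmeval/smp/file.py`):

```python
import os

PRED_FORMATS = ("tsv", "xlsx", "json")
EVAL_FORMATS = ("csv", "json")

def get_format(var, allowed, default):
    """Read a format env var, falling back to the default, and reject
    anything outside the allowed set (mirroring the documented errors)."""
    value = os.environ.get(var, default)
    if value not in allowed:
        raise ValueError(f"Unsupported {var} {value}")
    return value

os.environ["PRED_FORMAT"] = "json"
os.environ.pop("EVAL_FORMAT", None)
pred_fmt = get_format("PRED_FORMAT", PRED_FORMATS, "xlsx")
eval_fmt = get_format("EVAL_FORMAT", EVAL_FORMATS, "csv")
print(pred_fmt, eval_fmt)  # json csv
```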
Model-Specific
- `EVAL_MODEL`: Override model path for Aguvis configuration. Default: `xlangai/Aguvis-7B-720P` (`vlmeval/config.py:1848-1850`).
- `ENV_433`, `ENV_437`, `ENV_440`, `ENV_latest`: Python environment paths for different transformers versions, used by `vlmutil run` (`vlmeval/tools.py:360-363`).
Quick Install
```bash
# Create .env file at repository root
cat > .env << 'EOF'
OPENAI_API_KEY=sk-your-key-here
# Optional: Azure fallback
# AZURE_OPENAI_API_KEY=your-azure-key
# Optional: Proxy for evaluation
# EVAL_PROXY=http://proxy:8080
EOF
```
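After writing the file, a quick sanity check (illustrative only, not part of VLMEvalKit) is to confirm the OpenAI key line exists and carries the expected `sk-` prefix:

```shell
# Self-contained demo: write a sample .env, then verify the key prefix
cat > .env << 'EOF'
OPENAI_API_KEY=sk-your-key-here
EOF
if grep -q '^OPENAI_API_KEY=sk-' .env; then
  echo "OPENAI_API_KEY looks valid"
else
  echo "missing or malformed key"
fi
```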
Code Evidence
`.env` file loading from `vlmeval/smp/misc.py:200-223`:
```python
def load_env():
    import logging
    logging.basicConfig(...)
    try:
        import vlmeval
    except ImportError:
        logging.error('VLMEval is not installed. Failed to import environment variables from .env file. ')
        return
    pth = osp.realpath(vlmeval.__path__[0])
    pth = osp.join(pth, '../.env')
    pth = osp.realpath(pth)
    if not osp.exists(pth):
        logging.error(f'Did not detect the .env file at {pth}, failed to load. ')
        return
    from dotenv import dotenv_values
    values = dotenv_values(pth)
    for k, v in values.items():
        if v is not None and len(v):
            os.environ[k] = v
    logging.info(f'API Keys successfully loaded from {pth}')
```
OpenAI key validation from `vlmeval/smp/vlm.py:188-193`:
```python
def gpt_key_set():
    openai_key = os.environ.get('OPENAI_API_KEY', None)
    if openai_key is None:
        openai_key = os.environ.get('AZURE_OPENAI_API_KEY', None)
        return isinstance(openai_key, str)
    return isinstance(openai_key, str) and openai_key.startswith('sk-')
```
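This validation can be exercised outside the repo with a self-contained replica (only `os` is needed); note the Azure fallback accepts any string, while OpenAI keys must carry the `sk-` prefix:

```python
import os

# Self-contained replica of gpt_key_set(), runnable for illustration
def gpt_key_set():
    openai_key = os.environ.get('OPENAI_API_KEY', None)
    if openai_key is None:
        openai_key = os.environ.get('AZURE_OPENAI_API_KEY', None)
        return isinstance(openai_key, str)
    return isinstance(openai_key, str) and openai_key.startswith('sk-')

os.environ.pop('AZURE_OPENAI_API_KEY', None)
os.environ['OPENAI_API_KEY'] = 'not-a-valid-key'
bad = gpt_key_set()   # False: OpenAI keys must start with 'sk-'
os.environ['OPENAI_API_KEY'] = 'sk-demo'
good = gpt_key_set()  # True
print(bad, good)  # False True
```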
FWD_API forwarding from `run.py:234-241`:
```python
if os.environ.get('FWD_API', None) == '1':
    from vlmeval.config import api_models as supported_APIs
    from vlmeval.api import GPT4V
    for m in args.model:
        if m in supported_APIs:
            kws = supported_VLM[m].keywords
            supported_VLM[m] = partial(GPT4V, **kws)
            logger.warning(f'FWD_API is set, will use class `GPT4V` for {m}')
```
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `Did not detect the .env file at {pth}, failed to load.` | Missing `.env` file | Create `.env` at repo root with required keys |
| `Failed to obtain answer via API.` | API key invalid or missing | Verify `OPENAI_API_KEY` in `.env` starts with `sk-` |
| `Unsupported PRED_FORMAT xxx` | Invalid `PRED_FORMAT` env var | Use one of: `tsv`, `xlsx`, `json` |
| `Unsupported EVAL_FORMAT xxx` | Invalid `EVAL_FORMAT` env var | Use one of: `csv`, `json` |
Compatibility Notes
- ModelScope fallback: Setting `VLMEVALKIT_USE_MODELSCOPE=1` switches dataset/model downloads from HuggingFace to ModelScope, useful in regions where HuggingFace is blocked.
- Proxy handling: The `EVAL_PROXY` variable is temporarily set during evaluation and restored afterward (`run.py:459-480`), ensuring it does not affect inference API calls.
- FWD_API mode: When `FWD_API=1`, all API models are routed through a single `GPT4V` wrapper class, useful for API gateway/proxy setups.
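The rebinding idea behind `FWD_API` can be sketched in isolation. `GPT4VLike`, `NativeWrapper`, and the registry here are illustrative stand-ins for vlmeval's `GPT4V` class and `supported_VLM` dict; the mechanism shown (reading `functools.partial.keywords` and re-wrapping with a different class) matches the excerpt above.

```python
from functools import partial

class GPT4VLike:
    """Stand-in for the single wrapper class all API models route through."""
    def __init__(self, model='gpt-4o', **kwargs):
        self.model = model
        self.kwargs = kwargs

class NativeWrapper:
    """Stand-in for a model's original API wrapper class."""
    def __init__(self, model='claude-3'):
        self.model = model

registry = {'some_api_model': partial(NativeWrapper, model='claude-3')}

# Rebind to the single wrapper class, preserving the original keywords
kws = registry['some_api_model'].keywords
registry['some_api_model'] = partial(GPT4VLike, **kws)

inst = registry['some_api_model']()
print(type(inst).__name__, inst.model)  # GPT4VLike claude-3
```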