Environment:OpenBMB UltraFeedback OpenAI API Environment
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, NLP, Annotation |
| Last Updated | 2026-02-08 06:00 GMT |
Overview
Python environment with the `openai` package configured with a valid API key for GPT-4 and GPT-3.5-Turbo access.
Description
This environment provides access to the OpenAI ChatCompletion API, specifically the `gpt-4` and `gpt-3.5-turbo` models. Three pipelines use it: (1) completion generation via the API_Caller class in main.py, (2) critique annotation in annotate_critique.py, and (3) fine-grained preference annotation in annotate_preference.py. The code calls the legacy `openai.ChatCompletion.create()` interface (openai package before v1.0). Each of these three files sets the API key via `openai.api_key = "PUT YOUR KEY HERE"`, and the placeholder must be replaced with a real key before execution. The score correction pipeline in fix_overall_score_issue.py also requires GPT-4 API access to re-annotate ambiguous scores.
Usage
Use this environment for any workflow that requires GPT-4 annotation, GPT-4 critique generation, or API-based completion generation. This is a mandatory prerequisite for the GPT4_Critique_Annotator, GPT4_Preference_Annotator, Score_Correction_Pipeline, and the API branch of Multi_Backend_Inference.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Any (Linux, macOS, Windows) | No OS-specific requirements; API-only |
| Hardware | No GPU required | All computation happens server-side at OpenAI |
| Network | Internet access to api.openai.com | Stable connection required; retry logic handles transient failures |
Dependencies
Python Packages
- `openai` < 1.0 (legacy ChatCompletion.create interface)
- `requests` (used in annotate_preference.py)
- `datasets`
- `pandas`
- `tqdm`
- `json` (stdlib)
- `re` (stdlib)
Credentials
The following credentials must be configured before running annotation or API-based completion:
- `openai.api_key`: OpenAI API key with access to `gpt-4` and `gpt-3.5-turbo` models. Set directly in code (replace "PUT YOUR KEY HERE" placeholder in main.py:L93, annotate_critique.py:L15, annotate_preference.py:L15, fix_overall_score_issue.py:L7).
WARNING: The codebase hardcodes the API key placeholder directly in source files. For production use, inject the key via an environment variable instead.
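A minimal sketch of the environment-variable approach suggested above. The variable name `OPENAI_API_KEY` and the helper `load_api_key` are illustrative assumptions; the original scripts do not read any environment variable.

```python
import os

# Hypothetical variable name; not used anywhere in the original scripts.
API_KEY_ENV_VAR = "OPENAI_API_KEY"
PLACEHOLDER = "PUT YOUR KEY HERE"


def load_api_key(env_var: str = API_KEY_ENV_VAR) -> str:
    """Return the API key from the environment, failing fast if it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(
            f"Set the {env_var} environment variable to a valid OpenAI API key."
        )
    return key


# In each script, the hardcoded assignment could then become:
#   import openai
#   openai.api_key = load_api_key()
```

Failing fast on a missing or placeholder key surfaces the problem before any request is sent, rather than after the retry loop has run out.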
Quick Install
# Install OpenAI SDK (legacy version, pre-v1.0)
pip install "openai<1.0"
# Additional runtime dependencies
pip install requests datasets pandas tqdm
Code Evidence
API key configuration from `main.py:93`:
openai.api_key = "PUT YOUR KEY HERE"
API key configuration from `annotate_critique.py:15`:
openai.api_key = "PUT YOUR KEY HERE"
API key configuration from `annotate_preference.py:15`:
openai.api_key = "PUT YOUR KEY HERE"
Legacy ChatCompletion API usage from `annotate_critique.py:49-61`:
response = openai.ChatCompletion.create(**{
    "model": "gpt-4",
    "messages": [
        {"role": "system", "content": sys_prompt},
        {"role": "user", "content": user_prompt}
    ],
    "temperature": 0,
    "max_tokens": 1024,
    "top_p": 0.6,
    "presence_penalty": 0,
    "frequency_penalty": 0
})
MAX_API_RETRY constant from `annotate_preference.py:13`:
MAX_API_RETRY=10
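The retry behavior that this constant controls can be sketched roughly as follows. This is not the repository's code: the helper name `call_with_retry`, the fixed sleep interval, and the broad `except Exception` are assumptions made for illustration. Only the retry limit of 10 and the final `Exception("API Error")` come from this page.

```python
import time

MAX_API_RETRY = 10  # value quoted from annotate_preference.py


def call_with_retry(request_fn, max_retry=MAX_API_RETRY, sleep_seconds=1.0):
    """Call request_fn() until it succeeds or max_retry attempts are used up."""
    for attempt in range(1, max_retry + 1):
        try:
            return request_fn()
        except Exception as exc:  # assumption: the real scripts may catch narrower errors
            print(f"attempt {attempt}/{max_retry} failed: {exc}")
            if attempt < max_retry:
                time.sleep(sleep_seconds)  # assumption: fixed pause between attempts
    # Every attempt failed; mirror the error listed under Common Errors.
    raise Exception("API Error")


# Hypothetical usage with the legacy SDK:
#   response = call_with_retry(
#       lambda: openai.ChatCompletion.create(model="gpt-4", messages=messages)
#   )
```

Passing the request as a callable keeps the sketch independent of the `openai` package, so the loop can be exercised without network access.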
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `openai.error.AuthenticationError` | Invalid or missing API key | Replace "PUT YOUR KEY HERE" with a valid OpenAI API key |
| `openai.error.RateLimitError` | Too many requests per minute | The scripts already retry (10 to 20 attempts, sleeping between them); reduce parallelism if the error persists |
| `Exception("API Error")` | All retry attempts exhausted | Check API key permissions, account balance, and network connectivity |
| `AttributeError: module 'openai' has no attribute 'ChatCompletion'` | openai package >= 1.0 installed | Downgrade with `pip install "openai<1.0"` (the code uses the legacy interface) |
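To catch the last error in the table before any script runs, a small pre-flight check on the installed SDK version may help. The helpers `is_legacy_openai` and `check_installed_openai` are hypothetical and not part of the codebase; the sketch assumes Python 3.8+ for `importlib.metadata`.

```python
from importlib.metadata import version  # standard library, Python 3.8+


def is_legacy_openai(version_string: str) -> bool:
    """True when the major version is below 1 (the legacy, pre-1.0 SDK line)."""
    major = int(version_string.split(".")[0])
    return major < 1


def check_installed_openai() -> None:
    """Raise a readable error if the installed openai package is 1.0 or newer."""
    installed = version("openai")  # raises PackageNotFoundError if openai is missing
    if not is_legacy_openai(installed):
        raise RuntimeError(
            f'openai {installed} is installed; run: pip install "openai<1.0"'
        )
```

Calling `check_installed_openai()` at the top of a script would replace the `AttributeError` with a message that names the fix directly.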
Compatibility Notes
- openai package version: The codebase uses the legacy `openai.ChatCompletion.create()` interface, which was removed in openai 1.0. You must install `openai < 1.0`.
- Model availability: The code hardcodes `gpt-4` and `gpt-4-0613` model names. Ensure your API key has access to these models.
- Rate limits: Three separate files make independent API calls with their own retry loops. Running annotation in parallel across files may exceed rate limits.
- Cost: Annotating 64k prompts with 4 completions each across critique and 4-aspect preference annotation results in significant API costs.
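To put the cost note in rough perspective, the number of API calls can be estimated with simple arithmetic. The sketch below rests on two assumptions that this page does not confirm: critique annotation issues one call per completion, and preference annotation issues one call per prompt per aspect. It also treats "64k" as exactly 64,000 prompts. The function name `estimate_api_calls` is hypothetical.

```python
def estimate_api_calls(num_prompts: int, completions_per_prompt: int, num_aspects: int) -> dict:
    """Rough call-count estimate under the assumptions stated in the lead-in."""
    # Assumption: one critique request per (prompt, completion) pair.
    critique_calls = num_prompts * completions_per_prompt
    # Assumption: one preference request per (prompt, aspect) pair.
    preference_calls = num_prompts * num_aspects
    return {
        "critique": critique_calls,
        "preference": preference_calls,
        "total": critique_calls + preference_calls,
    }


calls = estimate_api_calls(num_prompts=64_000, completions_per_prompt=4, num_aspects=4)
print(calls)
```

Under these assumptions the estimate comes to 256,000 calls per pipeline and 512,000 in total, before counting any retries. Multiplying by the per-request token usage and the current model pricing would give a cost figure.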