Implementation: FlagOpen FlagEmbedding Reinforced IR GPTAgent
| Knowledge Sources | Details |
|---|---|
| Domains | Language Models, API Integration, Data Generation |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
OpenAI GPT API wrapper for parallel text generation in reinforced information retrieval data pipeline.
Description
This class provides a unified interface for generating text using OpenAI's GPT models (including GPT-4 and GPT-4o-mini) with support for parallel processing and robust error handling. It manages API connections, handles retries on failures, and supports both single and batch generation with multithreading for throughput optimization.
The agent supports flexible generation parameters including temperature, top_p, max_tokens, and beam search. It includes two generation modes: standard prompt-based generation and direct message-based generation for more complex conversational contexts. The implementation uses ThreadPoolExecutor for efficient parallel API calls and includes automatic retry logic with exponential backoff.
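The retry-with-backoff and thread-pool pattern described above can be sketched as follows. This is a minimal, hypothetical sketch of the pattern, not the repository's exact code; the names `with_retries` and `generate_parallel` are illustrative, and the real class wires these behaviors into its `generate` methods.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retries(fn, max_attempts=5, base_delay=1.0):
    """Retry fn on failure with exponential backoff: waits of 1s, 2s, 4s, ..."""
    def wrapped(*args, **kwargs):
        for attempt in range(max_attempts):
            try:
                return fn(*args, **kwargs)
            except Exception:
                if attempt == max_attempts - 1:
                    raise  # out of attempts: surface the error
                time.sleep(base_delay * (2 ** attempt))
    return wrapped


def generate_parallel(call_api, prompts, thread_count=4):
    """Fan prompts out over a thread pool; map preserves input order."""
    safe_call = with_retries(call_api)
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        return list(pool.map(safe_call, prompts))
```

`ThreadPoolExecutor.map` returns results in the same order as the input prompts, which is why parallel batch generation can still line results up with their prompts.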
Usage
Use this agent to generate synthetic queries, answers, or other text for training data augmentation in the Reinforced IR pipeline, particularly when OpenAI models are preferred over local LLMs.
Code Reference
Source Location
- Repository: FlagOpen_FlagEmbedding
- File: research/Reinforced_IR/data_generation/agent/gpt.py
- Lines: 1-144
Signature
class GPTAgent:
    def __init__(
        self,
        model_name: str = "gpt-4o-mini",
        api_key: str = None,
        base_url: str = None,
        n: int = 1
    )

    def generate(
        self,
        prompts,
        use_beam_search: bool = False,
        api_key: str = None,
        temperature: float = 0,
        top_p: float = 1,
        max_tokens: int = 300,
        thread_count: int = None
    ):
        """Generate text for multiple prompts in parallel"""

    def generate_direct(self, prompts, thread_count: int = None):
        """Generate from message list format"""
Import
from openai import OpenAI
from concurrent.futures import ThreadPoolExecutor
from tqdm import tqdm
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model_name | str | No | OpenAI model name (default: "gpt-4o-mini") |
| api_key | str | No | OpenAI API key (can use environment variable) |
| base_url | str | No | Custom API base URL |
| prompts | str/List[str] | Yes | Single prompt or list of prompts |
| temperature | float | No | Sampling temperature (default: 0) |
| top_p | float | No | Nucleus sampling parameter (default: 1) |
| max_tokens | int | No | Maximum tokens to generate (default: 300) |
| thread_count | int | No | Parallel threads (default: CPU count) |
Outputs
| Name | Type | Description |
|---|---|---|
| results | str/List[str] | Generated text (single string if n=1, list if n>1) |
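The single-vs-list contract above can be illustrated with a pair of hypothetical helpers: normalize the input to a list for batch processing, then mirror the caller's original shape on the way out. These helper names are assumptions for illustration, not functions from the repository.

```python
def normalize_prompts(prompts):
    """Accept a single prompt string or a list; return (list, was_single)."""
    if isinstance(prompts, str):
        return [prompts], True
    return list(prompts), False


def shape_results(results, was_single):
    """Mirror the input shape: unwrap the one-element list for a single prompt."""
    return results[0] if was_single else results
```

This is why `agent.generate(prompts="one query")` returns a plain string while `agent.generate(prompts=[...])` returns a list aligned with the input.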
Usage Examples
# Initialize GPT agent
agent = GPTAgent(
model_name="gpt-4o-mini",
api_key="your-api-key",
n=1
)
# Single prompt generation
prompt = "Generate a search query about machine learning"
result = agent.generate(
prompts=prompt,
temperature=0.7,
top_p=0.9,
max_tokens=100
)
print(result)
# Batch generation with parallel processing
prompts = [
"Generate a query about neural networks",
"Generate a query about deep learning",
"Generate a query about transformers"
]
results = agent.generate(
prompts=prompts,
temperature=0.8,
thread_count=4
)
# Direct message-based generation
messages = [
[
{"role": "system", "content": "You are a helpful assistant"},
{"role": "user", "content": "Generate a query"}
]
]
results = agent.generate_direct(messages, thread_count=2)
# Beam search generation
result = agent.generate(
prompts="Generate the best query",
use_beam_search=True,
temperature=1.0
)