Implementation: FlagOpen FlagEmbedding Reinforced IR GPTAgent
| Knowledge Sources | Details |
|---|---|
| Domains | Language Models, API Integration, Data Generation |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
OpenAI GPT API wrapper for parallel text generation in reinforced information retrieval data pipeline.
Description
This class provides a unified interface for generating text using OpenAI's GPT models (including GPT-4 and GPT-4o-mini) with support for parallel processing and robust error handling. It manages API connections, handles retries on failures, and supports both single and batch generation with multithreading for throughput optimization.
The agent supports flexible generation parameters including temperature, top_p, max_tokens, and beam search. It includes two generation modes: standard prompt-based generation and direct message-based generation for more complex conversational contexts. The implementation uses ThreadPoolExecutor for efficient parallel API calls and includes automatic retry logic with exponential backoff.
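The retry-with-backoff and thread-pool pattern described above can be sketched as follows. This is a minimal, hypothetical sketch of the pattern, not the repository's exact code; the names `with_retries` and `generate_parallel` are illustrative, and the real class wires these behaviors into its `generate` methods.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retries(fn, max_attempts=5, base_delay=1.0):
    """Retry fn on failure with exponential backoff: waits of 1s, 2s, 4s, ..."""
    def wrapped(*args, **kwargs):
        for attempt in range(max_attempts):
            try:
                return fn(*args, **kwargs)
            except Exception:
                if attempt == max_attempts - 1:
                    raise  # out of attempts: surface the error
                time.sleep(base_delay * (2 ** attempt))
    return wrapped


def generate_parallel(call_api, prompts, thread_count=4):
    """Fan prompts out over a thread pool; map preserves input order."""
    safe_call = with_retries(call_api)
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        return list(pool.map(safe_call, prompts))
```

`ThreadPoolExecutor.map` returns results in the same order as the input prompts, which is why parallel batch generation can still line results up with their prompts.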
Usage
Use this agent to generate synthetic queries, answers, or other text for training data augmentation in the Reinforced IR pipeline, particularly when OpenAI models are preferred over local LLMs.
Code Reference
Source Location
- Repository: FlagOpen_FlagEmbedding
- File: research/Reinforced_IR/data_generation/agent/gpt.py
- Lines: 1-144
Signature
class GPTAgent:
    def __init__(
        self,
        model_name: str = "gpt-4o-mini",
        api_key: str = None,
        base_url: str = None,
        n: int = 1
    )

    def generate(
        self,
        prompts,
        use_beam_search: bool = False,
        api_key: str = None,
        temperature: float = 0,
        top_p: float = 1,
        max_tokens: int = 300,
        thread_count: int = None
    ):
        """Generate text for multiple prompts in parallel"""

    def generate_direct(self, prompts, thread_count: int = None):
        """Generate from message list format"""
Import
from openai import OpenAI
from concurrent.futures import ThreadPoolExecutor
from tqdm import tqdm
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model_name | str | No | OpenAI model name (default: "gpt-4o-mini") |
| api_key | str | No | OpenAI API key (can use environment variable) |
| base_url | str | No | Custom API base URL |
| prompts | str/List[str] | Yes | Single prompt or list of prompts |
| temperature | float | No | Sampling temperature (default: 0) |
| top_p | float | No | Nucleus sampling parameter (default: 1) |
| max_tokens | int | No | Maximum tokens to generate (default: 300) |
| thread_count | int | No | Parallel threads (default: CPU count) |
Outputs
| Name | Type | Description |
|---|---|---|
| results | str/List[str] | Generated text (single string if n=1, list if n>1) |
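The single-vs-list contract above can be illustrated with a pair of hypothetical helpers: normalize the input to a list for batch processing, then mirror the caller's original shape on the way out. These helper names are assumptions for illustration, not functions from the repository.

```python
def normalize_prompts(prompts):
    """Accept a single prompt string or a list; return (list, was_single)."""
    if isinstance(prompts, str):
        return [prompts], True
    return list(prompts), False


def shape_results(results, was_single):
    """Mirror the input shape: unwrap the one-element list for a single prompt."""
    return results[0] if was_single else results
```

This is why `agent.generate(prompts="one query")` returns a plain string while `agent.generate(prompts=[...])` returns a list aligned with the input.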
Usage Examples
# Initialize GPT agent
agent = GPTAgent(
model_name="gpt-4o-mini",
api_key="your-api-key",
n=1
)
# Single prompt generation
prompt = "Generate a search query about machine learning"
result = agent.generate(
prompts=prompt,
temperature=0.7,
top_p=0.9,
max_tokens=100
)
print(result)
# Batch generation with parallel processing
prompts = [
"Generate a query about neural networks",
"Generate a query about deep learning",
"Generate a query about transformers"
]
results = agent.generate(
prompts=prompts,
temperature=0.8,
thread_count=4
)
# Direct message-based generation
messages = [
[
{"role": "system", "content": "You are a helpful assistant"},
{"role": "user", "content": "Generate a query"}
]
]
results = agent.generate_direct(messages, thread_count=2)
# Beam search generation
result = agent.generate(
prompts="Generate the best query",
use_beam_search=True,
temperature=1.0
)