# Heuristic: HKUDS AI Trader Linear Retry Backoff
| Knowledge Sources | |
|---|---|
| Domains | Reliability, LLM_Agents |
| Last Updated | 2026-02-09 14:00 GMT |
## Overview
Linear backoff retry strategy for LLM API calls: delay = base_delay * attempt_number, with at most 3 attempts and a 30-second timeout per call.
## Description
The AI-Trader agent uses a linear (not exponential) backoff strategy when LLM API calls fail. After each failed attempt, the agent waits base_delay * attempt seconds before retrying; the final failed attempt re-raises immediately, so with max_retries=3 there are only two inter-attempt delays: 0.5s and 1.0s (with base_delay=0.5) or 1.0s and 2.0s (with base_delay=1.0). This is combined with a recursion limit of 100 for the LangChain agent invocation and a 30-second timeout on the underlying HTTP client.
## Usage
Apply this pattern when dealing with transient LLM API failures such as rate limiting (429), server errors (500/503), or network timeouts. The linear backoff is lighter than exponential backoff, suitable for APIs where brief pauses are sufficient to recover. Configure max_retries and base_delay in the agent config JSON.
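Retrying makes sense only for the transient failures listed above; a minimal sketch of that distinction (the helper name and exact status-code set are illustrative, not from the repository):

```python
# Hypothetical helper: decide whether an HTTP status is worth retrying.
# 429 (rate limit) and 5xx server errors are usually transient; other
# 4xx client errors such as 400/401/404 will not succeed on retry.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}

def is_retryable(status_code: int) -> bool:
    """Return True for transient HTTP errors that merit a retry."""
    return status_code in RETRYABLE_STATUS
```

In practice a retry loop would apply the backoff only when this predicate holds and fail fast otherwise.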
## The Insight (Rule of Thumb)
- Action: Set `max_retries=3`, `base_delay=1.0` in agent_config, and `timeout=30` on the ChatOpenAI client.
- Value: 3 attempts with linear inter-attempt delays (1s, then 2s) = at most 3 seconds of total wait before failure.
- Trade-off: Linear backoff may not be sufficient for sustained rate limiting. For heavy concurrent usage, exponential backoff or circuit-breaker patterns would be more robust.
- Recursion Limit: Set to 100 for agent reasoning chains to prevent infinite tool-calling loops.
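Because the final failed attempt raises immediately, sleeps occur only between attempts, so the worst-case total wait is base_delay * (1 + 2 + ... + (max_retries - 1)). A quick check, with an exponential variant for comparison (function names are illustrative):

```python
def total_linear_wait(max_retries: int, base_delay: float) -> float:
    """Sum of sleeps between attempts: base_delay * k for k = 1..max_retries-1."""
    return sum(base_delay * k for k in range(1, max_retries))

def total_exponential_wait(max_retries: int, base_delay: float) -> float:
    """Comparison point: base_delay * 2**(k-1) for k = 1..max_retries-1."""
    return sum(base_delay * 2 ** (k - 1) for k in range(1, max_retries))
```

With 3 attempts the two strategies happen to wait the same total (1s + 2s = 3s); the gap only opens up at higher retry counts, which is why linear backoff is adequate here but weaker under sustained rate limiting.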
## Reasoning
LLM API failures are typically transient (network blips, momentary rate limits). A short linear backoff covers the common case without adding unnecessary latency. The 30-second timeout prevents hanging on unresponsive endpoints. The recursion limit of 100 is generous enough for complex multi-step trading decisions while preventing runaway loops. These defaults are configured in configs/default_config.json and can be tuned per deployment.
## Code Evidence
Retry loop from agent/base_agent/base_agent.py:423-435:
```python
async def _ainvoke_with_retry(self, message: List[Dict[str, str]]) -> Any:
    """Agent invocation with retry"""
    for attempt in range(1, self.max_retries + 1):
        try:
            if self.verbose:
                print(f"Calling LLM API ({self.basemodel})...")
            return await self.agent.ainvoke({"messages": message}, {"recursion_limit": 100})
        except Exception as e:
            if attempt == self.max_retries:
                raise e
            print(f"Attempt {attempt} failed, retrying after {self.base_delay * attempt} seconds...")
            print(f"Error details: {e}")
            await asyncio.sleep(self.base_delay * attempt)
```
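The same loop can be exercised outside the agent; a self-contained sketch (the flaky stub and the shortened delays are illustrative, not part of the repository):

```python
import asyncio

async def invoke_with_retry(call, max_retries=3, base_delay=0.01):
    """Linear-backoff retry mirroring the loop above: sleep base_delay * attempt
    between attempts, re-raise after the final failure."""
    for attempt in range(1, max_retries + 1):
        try:
            return await call()
        except Exception:
            if attempt == max_retries:
                raise
            await asyncio.sleep(base_delay * attempt)

async def demo():
    calls = {"n": 0}

    async def flaky():
        # Fails twice, then succeeds, simulating a transient API error.
        calls["n"] += 1
        if calls["n"] < 3:
            raise RuntimeError("transient failure")
        return "ok"

    return await invoke_with_retry(flaky), calls["n"]

result, attempts = asyncio.run(demo())
```

Here the third attempt succeeds, so exactly two sleeps (base_delay * 1 and base_delay * 2) occur before the result is returned.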
ChatOpenAI client configuration from agent/base_agent/base_agent.py:387-397:
```python
self.model = ChatOpenAI(
    model=self.basemodel,
    base_url=self.openai_base_url,
    api_key=self.openai_api_key,
    max_retries=3,
    timeout=30,
)
```
Default config values from configs/default_config.json:
"agent_config": {
"max_steps": 30,
"max_retries": 3,
"base_delay": 1.0,
"initial_cash": 10000.0,
"verbose": true
}
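These values can be loaded like any JSON config; a minimal sketch of merging them over defaults (the dataclass name is illustrative; the key layout follows the excerpt above):

```python
import json
from dataclasses import dataclass

@dataclass
class AgentConfig:
    # Defaults mirror configs/default_config.json.
    max_steps: int = 30
    max_retries: int = 3
    base_delay: float = 1.0
    initial_cash: float = 10000.0
    verbose: bool = True

def load_agent_config(raw: str) -> AgentConfig:
    """Parse the 'agent_config' section, falling back to the defaults above."""
    data = json.loads(raw).get("agent_config", {})
    return AgentConfig(**data)

# Per-deployment override: only the retry parameters change.
cfg = load_agent_config('{"agent_config": {"max_retries": 5, "base_delay": 0.5}}')
```

Keys omitted from the deployment file keep their defaults, so tuning the retry behavior never requires restating the whole section.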