
Heuristic:Mbzuai oryx Awesome LLM Post training API Rate Limit Retry Strategy

From Leeroopedia

Knowledge Sources

Domains: API_Integration, Optimization
Last Updated: 2026-02-08 08:00 GMT

Overview

Retry strategy for handling Semantic Scholar API HTTP 429 rate-limit responses, with context-dependent retry counts: 3 for search queries, 10 for count queries, and unbounded for paper-detail fetches.

Description

The Semantic Scholar API enforces rate limits that return HTTP 429 responses when exceeded. The codebase uses two different retry strategies depending on the type of query: the search_papers function retries up to 3 times with a 10-second backoff, while the get_paper_count function retries up to 10 times with the same 10-second backoff. The fetch_paper_details function uses recursive self-retry on 429, effectively retrying indefinitely until the global paper limit is reached.

Usage

Apply this heuristic when integrating with rate-limited APIs, particularly the Semantic Scholar API. The different retry counts reflect the relative importance and cost of each query type: count queries are lightweight but numerous (one per keyword per year), while search queries are heavier but fewer.

The Insight (Rule of Thumb)

  • Action: Implement retry loops with exponential or fixed backoff for HTTP 429 responses.
  • Value: Use 3 retries for heavyweight search queries and 10 retries for lightweight count queries. Use a fixed 10-second backoff interval.
  • Trade-off: Higher retry counts increase total runtime but improve data completeness. The 10-second wait is conservative; shorter waits risk hitting the rate limit again.
  • Variant: The fetch_paper_details function retries indefinitely on 429 by recursively calling itself, which can lead to deep call stacks if the API is persistently rate-limited.
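The rule of thumb above can be condensed into one bounded-retry helper, with the retry count chosen per query type. This is a minimal sketch, not code from the repository: the name `request_with_retry` and the injected `send` callable are my own, introduced so the loop is self-contained and testable.

```python
import time

def request_with_retry(send, max_retries, wait_s=10):
    """Call send() until it succeeds or the 429 retries are exhausted.

    send: zero-argument callable returning a response-like object
    (e.g. a wrapper around requests.get); max_retries: 3 for search
    queries, 10 for count queries, per the heuristic above.
    """
    for _ in range(max_retries):
        response = send()
        if response.status_code == 429:
            print(f"Rate limit exceeded. Retrying in {wait_s}s...")
            time.sleep(wait_s)  # fixed backoff, matching the scripts
            continue
        return response
    return None  # retries exhausted; the caller decides how to degrade
```

A caller would pass `max_retries=3` for a search query and `max_retries=10` for a count query, keeping the fixed 10-second backoff in both cases.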

Reasoning

The Semantic Scholar API free tier allows approximately 100 requests per 5 minutes. The deep collection script can issue thousands of requests during recursive reference/citation traversal. The 10-second backoff was chosen as a conservative estimate that allows the rate limit window to partially reset. The asymmetric retry counts (3 vs 10) reflect that:

  1. search_papers is called once at the start with few results, so 3 retries are sufficient.
  2. get_paper_count is called many times (once per keyword per year), and each call is cheap, so 10 retries maximize data completeness.
  3. fetch_paper_details is the core crawling function where data loss is most costly, justifying unlimited retries.
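A back-of-the-envelope check shows why 10 seconds is conservative, assuming the ~100-requests-per-5-minutes figure stated above:

```python
# Free-tier budget stated above: ~100 requests per 5 minutes.
budget_requests = 100
window_s = 5 * 60

# Sustainable pace: one request every 3 seconds on average.
sustainable_interval_s = window_s / budget_requests  # 3.0

# A 10 s backoff waits more than 3x the sustainable interval,
# letting part of the rolling rate-limit window reset first.
wait_s = 10
overshoot = wait_s / sustainable_interval_s  # ~3.33
```

So even if a request lands exactly at the limit, a 10-second pause gives the window room to recover, at the cost of slower overall collection.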

Code evidence from `scripts/deep_collection_sementic.py:17`:

rate_limit_wait = 10  # Increase wait time to 10s for API rate limits

Code evidence from `scripts/deep_collection_sementic.py:23`:

for _ in range(3):  # Retry up to 3 times if 429 error occurs
    response = requests.get(url)
    if response.status_code == 200:
        return response.json().get("data", [])
    elif response.status_code == 429:
        print(f"Rate limit exceeded. Retrying in {rate_limit_wait}s...")
        time.sleep(rate_limit_wait)

(As excerpted, the loop has no fallback return once the three retries are exhausted, so a persistent 429 would fall through the loop and the function would implicitly return None.)

Code evidence from `scripts/future_research_data.py:13-20`:

for _ in range(10):  # Retry up to 10 times
    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        data = response.json()
        return data.get('total', 0)
    elif response.status_code == 429:
        print(f"Rate limit exceeded for {year} on query '{query}'. Retrying in 10s...")
        time.sleep(10)

Code evidence from `scripts/deep_collection_sementic.py:50-53` (recursive retry):

if response.status_code == 429:
    print(f"API Rate limit reached. Retrying in {rate_limit_wait}s...")
    time.sleep(rate_limit_wait)
    return fetch_paper_details(paper_id, depth)  # Retry the same call
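The recursive call above adds one stack frame per 429, which is the deep-call-stack risk flagged in the Variant bullet. The same unbounded retry can be expressed iteratively; this is a sketch of the idea, not code from the script (the function name and the injected `get` callable are mine):

```python
import time

def fetch_paper_details_iter(get, url, wait_s=10):
    """Retry indefinitely on HTTP 429 without growing the call stack.

    get: callable such as requests.get, injected so the loop is
    self-contained and testable; url: the paper-details endpoint.
    """
    while True:
        response = get(url)
        if response.status_code != 429:
            return response
        print(f"API Rate limit reached. Retrying in {wait_s}s...")
        time.sleep(wait_s)  # fixed backoff, then loop instead of recursing
```

Because the loop reuses a single frame, the process can survive an arbitrarily long rate-limit episode that would eventually exhaust the stack in the recursive form.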
