Heuristic:ThreeSR Awesome Inference Time Scaling API Rate Limiting Tip
| Knowledge Sources | |
|---|---|
| Domains | API_Integration, Optimization |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Rate limiting strategy for Semantic Scholar API calls to avoid HTTP 429 errors when fetching multiple papers in sequence.
Description
The fetch_semantic_info.py script makes multiple sequential HTTP requests to the Semantic Scholar API: one search request followed by one detail request per paper (via get_paper_info() called inside format_paper_info()). Without delays between requests, the script risks hitting Semantic Scholar's rate limits, which would result in non-200 responses and missing data.
The script author included a commented-out time.sleep(10) between the search and write phases, indicating awareness of this issue. However, the more critical bottleneck is the per-paper detail request loop inside write_to_readme_in_sorted_order(), where format_paper_info() is called for each new paper, and each call makes an additional API request.
Usage
Use this heuristic when:
- Increasing the `LIMIT` parameter beyond the default of 1 to fetch multiple papers at once.
- Experiencing intermittent API failures (non-200 status codes) during script execution.
- Running the script frequently (e.g., in a cron job or CI pipeline).
The Insight (Rule of Thumb)
- Action: Add a `time.sleep()` delay between API calls, especially inside the loop where `format_paper_info()` is called for each paper.
- Value: A delay of 1-3 seconds between requests is typically sufficient for unauthenticated Semantic Scholar API access. The author's commented value of 10 seconds is conservative but safe.
- Trade-off: Slower script execution (additional N seconds per paper fetched) in exchange for reliable API responses and complete data.
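The Action above can be sketched as a small wrapper; the helper name `throttled` and the `REQUEST_DELAY_SECONDS` constant are hypothetical, not part of fetch_semantic_info.py:

```python
import time

REQUEST_DELAY_SECONDS = 2  # 1-3 s is usually enough for unauthenticated access

def throttled(fn, delay=REQUEST_DELAY_SECONDS):
    """Return a wrapper that sleeps for `delay` seconds after each call to fn."""
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        time.sleep(delay)
        return result
    return wrapper

# e.g. wrap the per-paper detail fetch once, before the loop runs:
# format_paper_info = throttled(format_paper_info)
```

Wrapping the function once keeps the loop body unchanged, so the delay is applied uniformly to every detail request.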
Reasoning
Semantic Scholar's public API enforces rate limits on unauthenticated requests. The exact limits are not publicly documented with precision, but empirical experience suggests approximately 100 requests per 5 minutes for unauthenticated users. When the limit is exceeded, the API returns HTTP 429 (Too Many Requests) or other non-200 status codes.
The current script architecture makes N+1 API calls for N new papers: 1 search request + N detail requests. With the default LIMIT=1, this is just 2 requests, which is unlikely to trigger rate limiting. However, if the limit is increased or the script is run in rapid succession, rate limiting becomes a real concern.
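A back-of-envelope check against the assumed (not officially documented) limit of ~100 requests per 5 minutes shows why a larger `LIMIT` needs spacing; the numbers below are illustrative:

```python
# Assumed unauthenticated limit: ~100 requests per 300 seconds
# (an empirical estimate, not an official figure).
LIMIT = 50                   # hypothetical increased fetch size
total_requests = LIMIT + 1   # 1 search request + N detail requests
min_spacing = 300 / 100      # seconds between requests to stay under the limit
```

At 51 requests, a 3-second spacing keeps the run safely under the assumed budget, which is consistent with the 1-3 second recommendation above.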
The commented-out code at line 217 is evidence that the author encountered or anticipated this issue:
```python
# Optionally: pause between queries to avoid too frequent requests
# time.sleep(10)
```
A more robust approach would be to add delays inside the format_paper_info() call chain or to implement exponential backoff on non-200 responses.
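A sketch of the exponential-backoff variant, assuming the response object exposes `status_code` as with the `requests` library; `get_with_backoff` is a hypothetical helper, not part of the script:

```python
import time

def get_with_backoff(fetch, max_retries=5, base_delay=1.0):
    """Call fetch() until it returns HTTP 200, sleeping base_delay,
    2*base_delay, 4*base_delay, ... between attempts.
    Returns the last response either way."""
    resp = None
    for attempt in range(max_retries):
        resp = fetch()
        if resp.status_code == 200:
            return resp
        if attempt < max_retries - 1:
            time.sleep(base_delay * (2 ** attempt))
    return resp  # still non-200 after max_retries; caller decides what to do

# e.g.: resp = get_with_backoff(lambda: requests.get(url, params=params))
```

Backoff only slows the script down when the API is actually throttling, so it pairs well with a small fixed delay rather than replacing it.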