Environment:Testtimescaling Testtimescaling github io Semantic Scholar API
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, API_Integration |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
External API dependency on the Semantic Scholar Graph API, used to retrieve citation counts for arXiv papers.
Description
The citation tracking pipeline depends on the Semantic Scholar Graph API as its sole data source for citation counts. The API is accessed via a public REST endpoint that accepts arXiv paper identifiers and returns structured metadata including the citationCount field.
The API requires no authentication for basic usage. The endpoint used is:
https://api.semanticscholar.org/graph/v1/paper/ARXIV:{arxiv_id}?fields=citationCount
This is a third-party service outside the repository's control. Its availability, rate limits, and data freshness are external constraints that affect the citation tracking pipeline.
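As a minimal sketch of the request shape described above, the two helpers below build the endpoint URL for one arXiv ID and read the count out of a decoded JSON body. The names build_citation_url and parse_citation_count are hypothetical and do not appear in the repository. Treating a missing or null citationCount as 0 is an assumption chosen to match the script's default.

```python
BASE_URL = "https://api.semanticscholar.org/graph/v1/paper"


def build_citation_url(arxiv_id: str) -> str:
    """Return the Graph API URL that requests only citationCount for one arXiv paper."""
    # The ARXIV: prefix tells the API which identifier scheme is being used.
    return f"{BASE_URL}/ARXIV:{arxiv_id}?fields=citationCount"


def parse_citation_count(payload: dict) -> int:
    """Extract citationCount from a decoded JSON response.

    Assumption: a missing or null field is reported as 0.
    """
    count = payload.get("citationCount")
    return count if isinstance(count, int) else 0


if __name__ == "__main__":
    print(build_citation_url("2503.24235"))
    print(parse_citation_count({"paperId": "abc", "citationCount": 12}))
```

Keeping URL construction and parsing separate from the HTTP call makes both pieces checkable without network access.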
Usage
This environment dependency is activated whenever the Automated Citation Tracking workflow executes the update_arxiv_citations.py script. The script sends one HTTP GET request per tracked paper to this API.
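The one-request-per-paper pattern could be sketched roughly as follows. This is illustrative only: fetch_all, fetch_one, and pause_seconds are hypothetical names, and the pause between requests is an assumption added for politeness toward the API, not something the source says the script does.

```python
import time
from typing import Callable, Dict, Iterable


def fetch_all(
    arxiv_ids: Iterable[str],
    fetch_one: Callable[[str], int],
    pause_seconds: float = 1.0,
) -> Dict[str, int]:
    """Call fetch_one once per arXiv ID and collect the results.

    fetch_one stands in for whatever function performs the HTTP GET.
    """
    counts: Dict[str, int] = {}
    for index, arxiv_id in enumerate(arxiv_ids):
        if index > 0:
            # Assumed pause so consecutive requests are spaced out.
            time.sleep(pause_seconds)
        counts[arxiv_id] = fetch_one(arxiv_id)
    return counts


if __name__ == "__main__":
    # A stub fetcher is used here so the example runs without network access.
    print(fetch_all(["2503.24235"], lambda _id: 0, pause_seconds=0))
```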
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| Network | Outbound HTTPS to api.semanticscholar.org (port 443) | Must be reachable from the GitHub Actions runner |
| Protocol | HTTP/1.1 or HTTP/2 with TLS | Standard HTTPS connection |
| Latency | Varies (typically < 2 seconds per request) | Script uses 10-second timeout per request |
Dependencies
External Service
- Semantic Scholar Graph API v1 (graph/v1/paper/ endpoint)
- Paper identifier format: ARXIV:{id} (the ARXIV: prefix is required)
- Response format: JSON with a citationCount field
Required Query Parameters
- fields=citationCount: Limits the response to the citation count field (the API still includes paperId alongside it)
Credentials
No authentication is required for public API access. However, the Semantic Scholar API does offer optional API keys for higher rate limits:
- S2_API_KEY (optional): If rate limits become an issue, an API key can be obtained from Semantic Scholar and passed via the x-api-key header.
The current implementation does not use an API key.
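If a key were adopted later, the header could be built along these lines. This is a sketch under assumptions: build_headers is a hypothetical helper, and reading the key from an S2_API_KEY environment variable (for example, populated from a GitHub Actions secret) is one possible setup, not the repository's current behavior.

```python
import os
from typing import Dict, Mapping, Optional


def build_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return request headers, adding x-api-key only when S2_API_KEY is set."""
    env = os.environ if env is None else env
    api_key = env.get("S2_API_KEY", "").strip()
    # With no key configured, fall back to unauthenticated public access.
    return {"x-api-key": api_key} if api_key else {}


if __name__ == "__main__":
    # The result would be passed to the HTTP call, for example:
    #   requests.get(url, headers=build_headers(), timeout=10)
    print(build_headers({"S2_API_KEY": "example-key"}))
```

Returning an empty dict when the variable is unset keeps the unauthenticated path unchanged.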
Quick Install
# No installation required. The API is a public web service.
# Test connectivity with curl:
curl "https://api.semanticscholar.org/graph/v1/paper/ARXIV:2503.24235?fields=citationCount"
# Expected response:
# {"paperId": "...", "citationCount": <number>}
Code Evidence
API endpoint construction from .github/scripts/update_arxiv_citations.py:L9:
url = f"https://api.semanticscholar.org/graph/v1/paper/ARXIV:{arxiv_id}?fields=citationCount"
HTTP request with timeout from .github/scripts/update_arxiv_citations.py:L10-14:
try:
    r = requests.get(url, timeout=10)
    r.raise_for_status()
    data = r.json()
    return data.get("citationCount", 0)
Error handling from .github/scripts/update_arxiv_citations.py:L15-17:
except Exception as e:
    print(f"[Warning] Failed to fetch citation for ArXiv:{arxiv_id} - {e}")
    return 0
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| [Warning] Failed to fetch citation for ArXiv:... - 404 Client Error | Paper not indexed by Semantic Scholar | Verify the arXiv ID is correct. Newly published papers may not be indexed yet. |
| [Warning] Failed to fetch citation for ArXiv:... - ConnectionError | Network connectivity issue or API downtime | Check runner network access. The script gracefully returns 0 and continues. |
| [Warning] Failed to fetch citation for ArXiv:... - ReadTimeout | API response exceeds 10-second timeout | Transient issue. The script returns 0 and the next daily run will retry. |
| 429 Too Many Requests | API rate limit exceeded | Reduce request frequency or obtain an API key for higher limits. |
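One way to reduce request frequency after a 429 is to retry with exponential backoff. The sketch below is not part of the current script: get_with_backoff, send, max_attempts, and base_delay are hypothetical, and send is assumed to be any callable that performs the request and returns a (status_code, payload) pair.

```python
import time
from typing import Callable, Tuple


def get_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns a status other than 429 or attempts run out."""
    status, payload = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Wait base_delay, then 2x, 4x, ... before each further attempt.
        sleep(base_delay * 2 ** (attempt - 1))
        status, payload = send()
    return status, payload


if __name__ == "__main__":
    # Simulated responses: two rate-limit replies followed by a success.
    replies = iter([(429, {}), (429, {}), (200, {"citationCount": 7})])
    print(get_with_backoff(lambda: next(replies), sleep=lambda _s: None))
```

The caller still has to decide what to do when the final status is 429. Returning 0, as the script does for other failures, is one option.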
Compatibility Notes
- Rate limits: The public API allows approximately 100 requests per 5-minute window without an API key. The current repository tracks only 1 paper, well within limits.
- Data freshness: Semantic Scholar updates citation counts periodically (not real-time). There may be a lag between a new citation appearing on Google Scholar and being reflected in the API.
- Paper coverage: Not all arXiv papers are indexed by Semantic Scholar. Papers may take several days after publication to appear in the index.
- API versioning: The graph/v1 endpoint is the current stable version. Breaking changes would require updating the URL in the script.