Environment:Testtimescaling Testtimescaling github io Semantic Scholar API

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, API_Integration
Last Updated: 2026-02-14 00:00 GMT

Overview

External API dependency on the Semantic Scholar Graph API, used to retrieve citation counts for arXiv papers.

Description

The citation tracking pipeline depends on the Semantic Scholar Graph API as its sole data source for citation counts. The API is accessed via a public REST endpoint that accepts arXiv paper identifiers and returns structured metadata including the citationCount field.

The API requires no authentication for basic usage. The endpoint used is:

https://api.semanticscholar.org/graph/v1/paper/ARXIV:{arxiv_id}?fields=citationCount

This is a third-party service outside the repository's control. Its availability, rate limits, and data freshness are external constraints that affect the citation tracking pipeline.

Usage

This environment dependency is activated whenever the Automated Citation Tracking workflow executes the update_arxiv_citations.py script. The script sends one HTTP GET request per tracked paper to this API.

System Requirements

Category | Requirement | Notes
Network | Outbound HTTPS to api.semanticscholar.org (port 443) | Must be reachable from the GitHub Actions runner
Protocol | HTTP/1.1 or HTTP/2 with TLS | Standard HTTPS connection
Latency | Varies (typically < 2 seconds per request) | Script uses a 10-second timeout per request

Dependencies

External Service

  • Semantic Scholar Graph API v1 (graph/v1/paper/ endpoint)
  • Paper identifier format: ARXIV:{id} prefix required
  • Response format: JSON with citationCount field

Required Query Parameters

  • fields=citationCount -- Limits the response to only the citation count field

Credentials

No authentication is required for public API access. However, the Semantic Scholar API does offer optional API keys for higher rate limits:

  • S2_API_KEY (optional): If rate limits become an issue, an API key can be obtained from Semantic Scholar and passed via the x-api-key header.

The current implementation does not use an API key.
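
If a key is ever adopted, the header can be attached conditionally so that keyless runs keep working. The following is a minimal sketch, not the repository's code: build_headers and build_request are hypothetical helper names, and it is an assumption that the key would be exposed to the script as an S2_API_KEY environment variable (for example via a GitHub Actions secret). The standard library's urllib is used only to keep the sketch dependency-free.

```python
import os
import urllib.request

API_URL = (
    "https://api.semanticscholar.org/graph/v1/paper/"
    "ARXIV:{arxiv_id}?fields=citationCount"
)


def build_headers(env=None):
    """Return request headers, adding x-api-key only when S2_API_KEY is set.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("S2_API_KEY", "").strip()
    return {"x-api-key": key} if key else {}


def build_request(arxiv_id, env=None):
    """Build (but do not send) a GET request for one paper's citation count."""
    url = API_URL.format(arxiv_id=arxiv_id)
    return urllib.request.Request(url, headers=build_headers(env))
```

With the requests library that the script already uses, the same headers could be passed as requests.get(url, headers=build_headers(), timeout=10). An empty dictionary leaves the request unauthenticated, which matches the current behavior.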

Quick Install

# No installation required. The API is a public web service.
# Test connectivity with curl:
curl "https://api.semanticscholar.org/graph/v1/paper/ARXIV:2503.24235?fields=citationCount"

# Expected response:
# {"paperId": "...", "citationCount": <number>}

Code Evidence

API endpoint construction from .github/scripts/update_arxiv_citations.py:L9:

url = f"https://api.semanticscholar.org/graph/v1/paper/ARXIV:{arxiv_id}?fields=citationCount"

HTTP request with timeout from .github/scripts/update_arxiv_citations.py:L10-14:

try:
    r = requests.get(url, timeout=10)
    r.raise_for_status()
    data = r.json()
    return data.get("citationCount", 0)

Error handling from .github/scripts/update_arxiv_citations.py:L15-17:

except Exception as e:
    print(f"[Warning] Failed to fetch citation for ArXiv:{arxiv_id} - {e}")
    return 0
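
Because the excerpts above return 0 both when a paper has no citations and when the request fails, a failed run is indistinguishable from a genuine zero. The sketch below shows one possible alternative that returns None on failure so a caller could keep the previously stored count. It is an illustration under stated assumptions, not the script's implementation: parse_citation_count and fetch_citation_count_or_none are hypothetical names, and urllib is used in place of requests to avoid third-party dependencies.

```python
import json
import urllib.request

API_URL = (
    "https://api.semanticscholar.org/graph/v1/paper/"
    "ARXIV:{arxiv_id}?fields=citationCount"
)


def parse_citation_count(body):
    """Extract citationCount from a JSON response body.

    Returns None when the body is not valid JSON or the field is missing,
    so that "unknown" is never confused with a real count of 0.
    """
    try:
        data = json.loads(body)
    except (TypeError, ValueError):
        return None
    count = data.get("citationCount") if isinstance(data, dict) else None
    if isinstance(count, int) and not isinstance(count, bool):
        return count
    return None


def fetch_citation_count_or_none(arxiv_id, timeout=10):
    """Fetch the citation count for one arXiv ID, or None on any failure."""
    url = API_URL.format(arxiv_id=arxiv_id)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_citation_count(resp.read())
    except OSError as e:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        print(f"[Warning] Failed to fetch citation for ArXiv:{arxiv_id} - {e}")
        return None
```

A caller could then write the new value only when the result is not None, leaving the last known count in place after a transient error.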

Common Errors

Error Message | Cause | Solution
[Warning] Failed to fetch citation for ArXiv:... - 404 Client Error | Paper not indexed by Semantic Scholar | Verify the arXiv ID is correct. Newly published papers may not be indexed yet.
[Warning] Failed to fetch citation for ArXiv:... - ConnectionError | Network connectivity issue or API downtime | Check runner network access. The script gracefully returns 0 and continues.
[Warning] Failed to fetch citation for ArXiv:... - ReadTimeout | API response exceeds the 10-second timeout | Transient issue. The script returns 0 and the next daily run will retry.
429 Too Many Requests | API rate limit exceeded | Reduce request frequency or obtain an API key for higher limits.
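
For the transient rows above (timeouts and 429 responses), retrying within the same run with exponential backoff is one common mitigation. The sketch below is illustrative only and is not part of the script: retry_delay and call_with_retry are hypothetical names, send is a hypothetical callable returning (status_code, headers, body), and the set of retryable status codes is an assumption. Whether the Semantic Scholar API sends a Retry-After header is not confirmed here; the sketch simply honors a numeric value if one is present.

```python
import time

# Assumption: these statuses are treated as transient and worth retrying.
RETRYABLE = {429, 500, 502, 503, 504}


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before the next try; `attempt` is 0-based.

    A numeric Retry-After value takes precedence; otherwise the delay
    doubles each attempt (base, 2*base, 4*base, ...) up to `cap`.
    """
    if retry_after is not None:
        try:
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # Retry-After may be an HTTP date; fall back to backoff.
    return min(cap, base * (2 ** attempt))


def call_with_retry(send, max_attempts=4, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or tries run out.

    Assumes max_attempts >= 1. Returns the (status, body) of the last call.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        sleep(retry_delay(attempt, headers.get("Retry-After")))
```

Passing sleep as a parameter keeps the helper testable without real waiting. With a single tracked paper and a daily schedule, the existing "return 0 and retry tomorrow" behavior may already be adequate, so this is an optional refinement.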

Compatibility Notes

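  • Rate limits: Requests without an API key draw on a rate-limit pool shared across all unauthenticated users, so the effective limit varies with overall traffic and a 429 response is possible even at low request volumes. A figure of roughly 100 requests per 5-minute window has been cited for keyless access. The current repository tracks only 1 paper, so its own request volume is negligible.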
  • Data freshness: Semantic Scholar updates citation counts periodically (not real-time). There may be a lag between a new citation appearing on Google Scholar and being reflected in the API.
  • Paper coverage: Not all arXiv papers are indexed by Semantic Scholar. Papers may take several days after publication to appear in the index.
  • API versioning: The graph/v1 endpoint is the current stable version. Breaking changes would require updating the URL in the script.
