Implementation:Testtimescaling Testtimescaling github io Fetch Arxiv Citation Count
| Metadata | |
|---|---|
| Page Type | Implementation |
| Implementation Type | API Doc |
| Domain | Data_Retrieval, API_Integration |
| Namespace | Testtimescaling_Testtimescaling_github_io |
| Workflow | Automated_Citation_Tracking |
| Date Created | 2026-02-14 |
| Principle | Principle:Testtimescaling_Testtimescaling_github_io_API_Data_Fetching |
| Knowledge Source | testtimescaling.github.io, Semantic Scholar API |
Overview
The fetch_arxiv_citation_count function retrieves the citation count for a given arXiv paper from the Semantic Scholar Graph API, returning 0 on any failure.
Description
This function is defined in the repository's automation script and serves as the core data retrieval component of the Automated Citation Tracking pipeline. It constructs a request URL from the provided arXiv ID, queries the Semantic Scholar Graph API for the citationCount field, and returns the integer citation count. The function is designed for resilience: any exception (network timeout, HTTP error status, or JSON parsing failure) is caught, printed as a warning, and results in a return value of 0 rather than a pipeline crash. A response that lacks the citationCount field also yields 0, through the default argument of data.get, but does not print a warning because no exception is raised.
The function is called iteratively for each arXiv paper tracked by the repository. The individual citation counts are then summed to produce a total that is written into the badge JSON file.
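The summing and badge-writing step described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes the badge follows the Shields endpoint JSON layout (schemaVersion, label, message, color), and the helper name build_badge, the label, the color, and the output path are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterable


def build_badge(counts: Iterable[int]) -> dict:
    """Sum per-paper counts into a Shields endpoint-style payload."""
    total = sum(counts)
    return {
        "schemaVersion": 1,
        "label": "citations",  # hypothetical label
        "message": str(total),  # the badge text is a string
        "color": "blue",  # hypothetical color
    }


# Illustrative counts; a 0 stands for a paper whose lookup failed.
badge = build_badge([42, 0, 7])

# Written to a temporary directory because the real output path is not shown here.
out_path = Path(tempfile.mkdtemp()) / "badge.json"
out_path.write_text(json.dumps(badge, indent=2), encoding="utf-8")
print(out_path.read_text(encoding="utf-8"))
```

Because a failed lookup contributes 0, the total silently undercounts whenever a request fails; the warning line printed by the function is the only trace of that.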
Usage
The update_arxiv_citations.py script calls this function during the CI workflow, once for each arXiv paper ID in its list of tracked papers, and aggregates the results.
Code Reference
Source Location
| File | .github/scripts/update_arxiv_citations.py |
| Lines | L4-17 |
| Repository | testtimescaling.github.io |
Signature
def fetch_arxiv_citation_count(arxiv_id: str) -> int:
    """
    Fetch the citation count (citationCount) for the given arXiv ID from the Semantic Scholar API.
    Return 0 if the paper is not found or the API call fails.
    """
    url = f"https://api.semanticscholar.org/graph/v1/paper/ARXIV:{arxiv_id}?fields=citationCount"
    try:
        r = requests.get(url, timeout=10)
        r.raise_for_status()
        data = r.json()
        return data.get("citationCount", 0)
    except Exception as e:
        print(f"[Warning] Failed to fetch citation for ArXiv:{arxiv_id} - {e}")
        return 0
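To make the function's three outcomes (success, missing field, exception) easier to see without network access, the sketch below mirrors its control flow with the HTTP call passed in as a parameter. The names fetch_count and FakeResponse are hypothetical and do not exist in the repository. The `or 0` is an addition to the original logic: dict.get only falls back to its default when the key is absent, so a response carrying an explicit null for citationCount would otherwise be returned as None.

```python
from typing import Any, Callable


class FakeResponse:
    """Hypothetical stand-in for requests.Response, for offline experiments."""

    def __init__(self, payload: Any, status: int = 200) -> None:
        self._payload = payload
        self.status_code = status

    def raise_for_status(self) -> None:
        # Same contract as the real method: raise on a 4xx or 5xx status.
        if self.status_code >= 400:
            raise RuntimeError(f"{self.status_code} error")

    def json(self) -> Any:
        return self._payload


def fetch_count(arxiv_id: str, get: Callable[..., Any]) -> int:
    """Same control flow as fetch_arxiv_citation_count, with the HTTP call injected."""
    url = f"https://api.semanticscholar.org/graph/v1/paper/ARXIV:{arxiv_id}?fields=citationCount"
    try:
        r = get(url, timeout=10)
        r.raise_for_status()
        data = r.json()
        # `or 0` also maps an explicit null to 0 (not part of the original function).
        return data.get("citationCount", 0) or 0
    except Exception as e:
        print(f"[Warning] Failed to fetch citation for ArXiv:{arxiv_id} - {e}")
        return 0


ok = fetch_count("2503.24235", lambda url, timeout: FakeResponse({"citationCount": 42}))
missing = fetch_count("2503.24235", lambda url, timeout: FakeResponse({}))
null = fetch_count("2503.24235", lambda url, timeout: FakeResponse({"citationCount": None}))
not_found = fetch_count("0000.00000", lambda url, timeout: FakeResponse({}, status=404))
print(ok, missing, null, not_found)
```

Only the 404 case prints a warning; the missing-field and null cases return 0 silently.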
Import
from update_arxiv_citations import fetch_arxiv_citation_count
Alternatively, run the full script from the repository root:
python .github/scripts/update_arxiv_citations.py
External Dependencies
| Dependency | Type | Description |
|---|---|---|
| requests | Python library | HTTP client library for making GET requests |
| Semantic Scholar Graph API | External service | https://api.semanticscholar.org/graph/v1/paper/ (provides citation metadata for academic papers) |
I/O Contract
Inputs
| Input | Type | Description | Example |
|---|---|---|---|
| arxiv_id | str | The arXiv identifier of the paper (without the ARXIV: prefix) | "2503.24235" |
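Since the function inserts arxiv_id into the URL as given, a caller may want to check the identifier first. The sketch below is one possible pre-check, not part of the repository: normalize_arxiv_id is a hypothetical helper, and the two patterns are an assumption covering the common new-style (YYMM.NNNNN) and old-style (archive/YYMMNNN) forms with an optional version suffix. Whether the Semantic Scholar endpoint accepts a version suffix is not established here.

```python
import re

# New-style IDs (YYMM.NNNN or YYMM.NNNNN) and old-style IDs (archive/YYMMNNN),
# each with an optional version suffix such as v2.
_NEW_STYLE = re.compile(r"\d{4}\.\d{4,5}(v\d+)?")
_OLD_STYLE = re.compile(r"[a-z-]+(\.[A-Z]{2})?/\d{7}(v\d+)?")


def normalize_arxiv_id(raw: str) -> str:
    """Hypothetical helper: strip whitespace and an 'arXiv:' prefix, then validate."""
    candidate = raw.strip()
    if candidate.lower().startswith("arxiv:"):
        candidate = candidate[len("arxiv:"):]
    if _NEW_STYLE.fullmatch(candidate) or _OLD_STYLE.fullmatch(candidate):
        return candidate
    raise ValueError(f"not a recognizable arXiv identifier: {raw!r}")


print(normalize_arxiv_id(" ARXIV:2503.24235 "))
print(normalize_arxiv_id("hep-th/9901001"))
```

Rejecting a malformed ID before the request distinguishes a caller mistake from a paper that genuinely has no citations, which the 0 return value alone cannot do.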
Outputs
| Output | Type | Description | Example |
|---|---|---|---|
| Citation count | int | The number of citations for the paper. Returns 0 if the paper is not found or if any error occurs. | 42 |
Usage Examples
Example 1: Basic usage
count = fetch_arxiv_citation_count("2503.24235")
print(f"Citations: {count}")
# Output: Citations: 42
Example 2: Handling a non-existent paper
count = fetch_arxiv_citation_count("0000.00000")
# [Warning] Failed to fetch citation for ArXiv:0000.00000 - 404 Client Error: Not Found
print(f"Citations: {count}")
# Output: Citations: 0
Example 3: Aggregating citations for multiple papers
arxiv_ids = ["2503.24235", "2501.12345", "2502.67890"]
total = sum(fetch_arxiv_citation_count(aid) for aid in arxiv_ids)
print(f"Total citations: {total}")
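A variation on the aggregation in Example 3 keeps the per-paper results and spaces out the requests. This is a sketch under assumptions: collect_counts and its pause_seconds parameter are hypothetical, the one-second default is arbitrary rather than a documented limit, and a stub replaces fetch_arxiv_citation_count so the example runs offline.

```python
import time
from typing import Callable, Dict, Iterable


def collect_counts(
    arxiv_ids: Iterable[str],
    fetch: Callable[[str], int],
    pause_seconds: float = 1.0,
) -> Dict[str, int]:
    """Call fetch once per ID, pausing between requests, and keep per-paper results."""
    counts: Dict[str, int] = {}
    for index, arxiv_id in enumerate(arxiv_ids):
        if index > 0:
            time.sleep(pause_seconds)  # no pause before the first request
        counts[arxiv_id] = fetch(arxiv_id)
    return counts


# Stub in place of fetch_arxiv_citation_count; unknown IDs map to 0, as failures do.
stub_counts = {"2503.24235": 42, "2501.12345": 5}
counts = collect_counts(
    ["2503.24235", "2501.12345", "2502.67890"],
    fetch=lambda aid: stub_counts.get(aid, 0),
    pause_seconds=0.0,
)
total = sum(counts.values())
zero_ids = [aid for aid, n in counts.items() if n == 0]
print(f"Total citations: {total}; zero results: {zero_ids}")
```

Listing the IDs that came back as 0 gives a quick way to spot lookups worth rechecking, since a failed request and an uncited paper look the same in the total.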
Related Pages
- Principle:Testtimescaling_Testtimescaling_github_io_API_Data_Fetching
- Environment:Testtimescaling_Testtimescaling_github_io_GitHub_Actions_Runner
- Environment:Testtimescaling_Testtimescaling_github_io_Python_3_Runtime
- Environment:Testtimescaling_Testtimescaling_github_io_Semantic_Scholar_API
- Heuristic:Testtimescaling_Testtimescaling_github_io_Hardcoded_IDs_vs_Registry
- Implementation:Testtimescaling_Testtimescaling_github_io_Json_Dump_Shields_Badge -- Consumes the citation count output to generate badge JSON
- Implementation:Testtimescaling_Testtimescaling_github_io_Actions_Checkout_V3 -- Must run before this script is available on disk