Implementation:ThreeSR Awesome Inference Time Scaling Get Paper Info Function
| Page Metadata | |
|---|---|
| Knowledge Sources | Awesome-Inference-Time-Scaling |
| Domains | Metadata Enrichment, API Integration, Scholarly Literature Discovery |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Concrete tool for fetching extended paper metadata from the Semantic Scholar API by paper ID, provided by the get_paper_info() function in fetch_semantic_info.py.
Description
The get_paper_info() function takes a Semantic Scholar paper ID (obtained from search results) and queries the Semantic Scholar v1 paper detail endpoint to retrieve extended metadata. This enriches the basic search result with additional fields -- most importantly the arxivId (used to construct arXiv links) and the abstract (displayed in the curated paper list).
Important note: This function uses the older v1 endpoint (https://api.semanticscholar.org/v1/paper/{paper_id}), not the graph/v1 endpoint used by search_papers(). The v1 endpoint returns a different response schema with fields like arxivId at the top level.
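The schema difference noted above can be absorbed in one small helper. The sketch below is illustrative only: `extract_arxiv_id` is a hypothetical name, not part of fetch_semantic_info.py. It assumes that a graph/v1 paper response requested with the `externalIds` field carries the arXiv identifier under `externalIds["ArXiv"]`, whereas the v1 response exposes it as a top-level `arxivId`.

```python
from typing import Optional


def extract_arxiv_id(payload: dict) -> Optional[str]:
    """Return the arXiv ID from a v1-style or graph/v1-style paper payload.

    Hypothetical helper for illustration; not part of fetch_semantic_info.py.
    """
    # v1 schema: the identifier sits at the top level of the response.
    arxiv_id = payload.get("arxivId")
    if arxiv_id:
        return arxiv_id
    # graph/v1 schema (assumption): identifiers are nested under "externalIds".
    external_ids = payload.get("externalIds") or {}
    return external_ids.get("ArXiv")
```

Keeping the lookup in one function means callers do not need to know which endpoint produced the payload.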
Usage
Import or call this function when:
- You have a paper ID from a search result and need the arXiv identifier to build PDF links
- You need the full abstract of a paper for display purposes
- You are enriching search results before formatting them into the README
Code Reference
Source Location: fetch_semantic_info.py, lines 43-46
Function Signature:
def get_paper_info(paper_id: str) -> dict
Import Statement:
from fetch_semantic_info import get_paper_info
Source Code:
def get_paper_info(paper_id):
    url = f'https://api.semanticscholar.org/v1/paper/{paper_id}'
    response = requests.get(url)
    return response.json()
Dependencies:
- requests (third-party HTTP library)
API Endpoint:
GET https://api.semanticscholar.org/v1/paper/{paper_id}
Note: This is the v1 endpoint, distinct from the graph/v1 endpoint used by search_papers().
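The repository function sends the request without a timeout and parses the body without checking the HTTP status. A more defensive variant might look like the sketch below. It is not the repository's implementation: `get_paper_info_checked`, the `http_get` parameter, and the 10-second default are assumptions made for illustration. `requests` is imported lazily so that a stub can be passed in for testing.

```python
V1_PAPER_URL = "https://api.semanticscholar.org/v1/paper/{paper_id}"


def get_paper_info_checked(paper_id, http_get=None, timeout=10.0):
    """Fetch v1 paper details, failing loudly on HTTP errors.

    Hypothetical variant of get_paper_info(); `http_get` is any callable
    with the same shape as requests.get (assumption for testability).
    """
    if http_get is None:
        import requests  # lazy import: only needed for real network calls
        http_get = requests.get
    url = V1_PAPER_URL.format(paper_id=paper_id)
    # A timeout keeps a stalled connection from blocking the whole run.
    response = http_get(url, timeout=timeout)
    # Raise on 4xx/5xx instead of handing back an error payload as if it
    # were paper metadata.
    response.raise_for_status()
    return response.json()
```

Callers that prefer the original behaviour of returning whatever JSON comes back can keep using get_paper_info() unchanged.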
I/O Contract
Inputs:
| Parameter | Type | Default | Description |
|---|---|---|---|
| paper_id | str | (required) | The Semantic Scholar unique paper identifier, typically obtained from the paperId field in search results |
Outputs:
| Field | Type | Description |
|---|---|---|
| arxivId | str or None | The arXiv identifier (e.g., "2501.12345"); used to construct arXiv abs and PDF URLs |
| abstract | str or None | The full text of the paper's abstract |
| title | str | The paper title |
| authors | list[dict] | List of author objects with name and authorId fields |
| venue | str | The publication venue |
| year | int | The publication year |
| citationCount | int | Number of citations (available in v1 response) |
| references | list[dict] | List of referenced papers (available in v1 response) |
The function returns a dict containing the full v1 API response. The fields most commonly used by this repository are arxivId and abstract.
Usage Examples
Example 1: Basic detail retrieval
from fetch_semantic_info import get_paper_info
# Using a Semantic Scholar paper ID from search results
paper_id = "649def34f8be52c8b66281af98ae884c09aef38b"
info = get_paper_info(paper_id)
print(f"Title: {info.get('title')}")
print(f"arXiv ID: {info.get('arxivId')}")
print(f"Abstract: {(info.get('abstract') or 'No abstract available.')[:200]}...")
Example 2: Constructing arXiv links from the result
from fetch_semantic_info import get_paper_info
paper_id = "649def34f8be52c8b66281af98ae884c09aef38b"
info = get_paper_info(paper_id)
arxiv_id = info.get("arxivId")  # None when the paper has no arXiv record
if arxiv_id:
    abs_url = f"https://arxiv.org/abs/{arxiv_id}"
    pdf_url = f"https://arxiv.org/pdf/{arxiv_id}"
    print(f"Abstract page: {abs_url}")
    print(f"PDF download: {pdf_url}")
else:
    print("No arXiv ID available for this paper.")
Example 3: Full pipeline -- search then enrich
from fetch_semantic_info import search_papers, get_paper_info
# Step 1: Search for papers
papers = search_papers("Inference-Time Scaling", limit=3)
# Step 2: Enrich each result with detail info
for paper in papers:
    paper_id = paper["paperId"]
    detail = get_paper_info(paper_id)
    title = paper["title"]
    # `or` also covers fields the API returns as null, not just absent keys
    arxiv_id = detail.get("arxivId") or "N/A"
    abstract = detail.get("abstract") or "No abstract available."
    print(f"Title: {title}")
    print(f"arXiv: {arxiv_id}")
    print(f"Abstract: {abstract[:150]}...")
    print()
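The pipeline above issues one detail request per search result, so a long result list produces a burst of calls. One way to space them out is sketched below. This is an assumption-laden illustration, not repository code: `enrich_with_delay`, its `fetch` and `sleep` parameters, and the one-second default pause are all hypothetical. Pass `get_paper_info` as `fetch` in real use.

```python
import time


def enrich_with_delay(papers, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Enrich search results one by one, pausing between detail requests.

    Hypothetical helper: `fetch` is any callable mapping a paper ID to a
    detail dict (e.g. get_paper_info); `sleep` is injectable for testing.
    """
    enriched = []
    for index, paper in enumerate(papers):
        if index > 0:
            # Pause before every request except the first.
            sleep(delay_seconds)
        detail = fetch(paper["paperId"])
        enriched.append({
            "title": paper.get("title"),
            "arxivId": detail.get("arxivId"),
            # `or` covers both a missing key and a null abstract.
            "abstract": detail.get("abstract") or "No abstract available.",
        })
    return enriched
```

A fixed pause is the simplest option; whether it is sufficient depends on the rate limits that apply to the caller, which this page does not specify.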
Example 4: Handling missing fields
from fetch_semantic_info import get_paper_info
info = get_paper_info("some-paper-id")
# Safely extract with defaults. format_paper_info uses .get() defaults, which
# cover absent keys; `or` additionally covers fields returned as null.
arxiv_id = info.get("arxivId") or "N/A"
abstract = info.get("abstract") or "No abstract available."
print(f"arXiv ID: {arxiv_id}")
print(f"Abstract: {abstract}")
Related Pages
- Principle:ThreeSR_Awesome_Inference_Time_Scaling_Paper_Detail_Retrieval -- The principle of Paper Detail Retrieval that this function implements
- Environment:ThreeSR_Awesome_Inference_Time_Scaling_Python_Runtime_Environment
- Environment:ThreeSR_Awesome_Inference_Time_Scaling_Semantic_Scholar_API_Environment
- Heuristic:ThreeSR_Awesome_Inference_Time_Scaling_API_Rate_Limiting_Tip