Implementation:Mbzuai oryx Awesome LLM Post training Get Paper Count

From Leeroopedia


Knowledge Sources
Domains: Bibliometrics, Trend_Analysis
Last Updated: 2026-02-08 07:30 GMT

Overview

Concrete tool for querying yearly publication counts from the Semantic Scholar API for research trend analysis.

Description

The get_paper_count function queries the Semantic Scholar /paper/search endpoint with a keyword and a year filter. It requests a single result (limit=1) to minimize data transfer, because only the total match count in the response is needed. When the API answers with HTTP 429 (rate limited), the function sleeps for 10 seconds and retries, up to 10 times. If an error occurs or the retries are exhausted, it returns 0. A custom User-Agent header identifies the request as academic research.
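The behavior described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the full endpoint URL, the `total` response field, the User-Agent string, the 30-second timeout, and the name `get_paper_count_sketch` with its `get` and `sleep` parameters are all assumptions made here so the retry logic can be shown and exercised without network access. The page says "up to 10 retries"; the sketch interprets that as at most 10 attempts in total.

```python
import time

# Assumed full URL; the page only names the /paper/search path.
SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"
# Hypothetical header value; the original User-Agent string is not shown.
HEADERS = {"User-Agent": "academic-research-script/0.1"}


def get_paper_count_sketch(query, year, get=None, sleep=time.sleep,
                           max_retries=10, retry_delay=10.0):
    """Return the total hit count for query and year, or 0 on any failure."""
    if get is None:
        import requests  # imported lazily so a stub can stand in for it
        get = requests.get

    # limit=1 keeps the payload small; only the total count is needed.
    params = {"query": query, "year": str(year), "limit": 1}

    for _ in range(max_retries):
        try:
            response = get(SEARCH_URL, params=params,
                           headers=HEADERS, timeout=30)
        except Exception:
            return 0  # network-level failure
        if response.status_code == 429:
            sleep(retry_delay)  # rate limited: wait, then try again
            continue
        if response.status_code != 200:
            return 0  # any other HTTP error
        try:
            # "total" is the assumed name of the match-count field.
            return int(response.json().get("total", 0))
        except (ValueError, TypeError, AttributeError):
            return 0  # malformed or unexpected response body
    return 0  # retries exhausted
```

Because 0 is returned both for "no papers found" and for every failure mode, a caller cannot tell the two apart from the return value alone; that matches the documented contract but is worth keeping in mind when reading trend plots.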

Usage

Call this function once for each keyword-year combination in the trend analysis. The calling script uses a nested loop: the outer loop iterates over keywords read from a CSV file, and the inner loop iterates over the years 2020-2025. The caller, not the function itself, applies a 1-second delay between calls to be polite to the API.

Code Reference

Source Location

Signature

def get_paper_count(query: str, year: int) -> int:
    """
    Get number of papers for a given query and year from Semantic Scholar.

    Args:
        query: Research keyword to search for.
        year: Publication year filter.

    Returns:
        int: Total number of papers matching the query for that year.
        Returns 0 on error or retry exhaustion.
    """

Import

# Function defined in scripts/future_research_data.py
# Dependencies:
import requests
import time

I/O Contract

Inputs

Name  | Type | Required | Description
query | str  | Yes      | Research keyword to search for
year  | int  | Yes      | Publication year filter (e.g., 2023)

Outputs

Name         | Type | Description
return value | int  | Total number of papers matching the query for the specified year. Returns 0 on error or retry exhaustion.

Usage Examples

Single Query

# Get paper count for a specific keyword and year
count = get_paper_count("reinforcement learning from human feedback", 2023)
print(f"RLHF papers in 2023: {count}")

Full Trend Analysis Loop

import time

keywords = ["RLHF", "Direct Preference Optimization", "MCTS for LLM"]
years = list(range(2020, 2026))

for keyword in keywords:
    counts = []
    for year in years:
        count = get_paper_count(keyword, year)
        counts.append(count)
        time.sleep(1)  # Polite delay between requests
    print(f"{keyword}: {dict(zip(years, counts))}")
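The loop above hard-codes its keywords, whereas the Usage section says they come from a CSV file. A possible way to wire that up is sketched below. The helper names `load_keywords` and `collect_trends`, the column name `keyword`, and the de-duplication step are hypothetical; the page does not show the CSV layout or how the original script reads it.

```python
import csv
import io
import time


def load_keywords(csv_file, column="keyword"):
    """Read non-empty, de-duplicated keywords from one column of an open CSV file."""
    keywords = []
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip()
        if value and value not in keywords:
            keywords.append(value)
    return keywords


def collect_trends(keywords, years, count_fn, delay=1.0, sleep=time.sleep):
    """Return {keyword: {year: count}}, pausing `delay` seconds after each call."""
    trends = {}
    for keyword in keywords:
        trends[keyword] = {}
        for year in years:
            trends[keyword][year] = count_fn(keyword, year)
            sleep(delay)  # polite delay applied by the caller
    return trends
```

In a real run, the file would be opened with `open(path, newline="", encoding="utf-8")` and `get_paper_count` passed as `count_fn`, with `range(2020, 2026)` as `years`. Passing the counting function and the sleep function as arguments keeps the loop easy to dry-run against a stub before spending API quota.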

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
