metadata
title: Google Scholar Citation API
emoji: 📚
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false

Google Scholar Citation API

A lightweight API that fetches Google Scholar author citation metrics via SerpAPI, with 24-hour caching to minimize API calls.

Features

  • Citation Metrics: Retrieve h-index, i10-index, total citations, citation graph, and article list for any Google Scholar author.
  • Daily Cache: Each author's data is cached for 24 hours, so at most one SerpAPI call is made per author per day, no matter how many requests hit the API.
  • Fast & Lightweight: Built with FastAPI + uvicorn, deployed as a Docker-based HuggingFace Space.

API Endpoints

Endpoint                       Method   Description
/                              GET      Welcome message & usage hint
/citations?author_id=XXX       GET      Get citation data for an author
/cache/status?author_id=XXX    GET      Check cache freshness for an author
/health                        GET      Health check
/docs                          GET      Interactive Swagger UI

Setup

1. Create a HuggingFace Space

  1. Go to huggingface.co/new-space
  2. Choose Docker as the SDK
  3. Push this repo to the Space

2. Set the SerpAPI Key

Go to Space Settings → Secrets and add:

Secret Name    Value
SERPAPI_KEY    Your SerpAPI API key
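
Inside the container, the secret is expected to reach the app as an environment variable of the same name (the Local Development section sets it the same way with `export`). The following is a minimal sketch of reading it with a clear failure when it is missing. `load_serpapi_key` is a hypothetical helper for illustration, not necessarily how this app reads the key.

```python
import os
from typing import Mapping


def load_serpapi_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the SerpAPI key from the environment, or raise if it is missing.

    Hypothetical helper: the variable name SERPAPI_KEY comes from this README,
    everything else is an illustrative assumption.
    """
    key = env.get("SERPAPI_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SERPAPI_KEY is not set; add it under Space Settings -> Secrets "
            "or export it in your shell for local runs."
        )
    return key
```

Failing at startup with an explicit message tends to be easier to diagnose than a failed SerpAPI request later on.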

3. Usage Example

# Replace YOUR-USERNAME-YOUR-SPACE with your Space's subdomain and JicYPdAAAAAJ with the author ID
curl "https://YOUR-USERNAME-YOUR-SPACE.hf.space/citations?author_id=JicYPdAAAAAJ"

Python Example

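import requests

resp = requests.get(
    "https://YOUR-USERNAME-YOUR-SPACE.hf.space/citations",
    params={"author_id": "JicYPdAAAAAJ"},
    timeout=30,  # avoid hanging if the Space is slow or unreachable
)
resp.raise_for_status()  # raise on HTTP errors instead of parsing an error body
data = resp.json()
print(data["citation_stats"]["table"])  # citations, h-index, i10-index
print(data["author"]["name"])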

How Caching Works

Request → Is there a cache file for this author_id?
  ├─ YES & age < 24h → return cached data  (_source: "cache")
  └─ NO or expired   → call SerpAPI → save to cache → return  (_source: "serpapi")

Cache entries are stored as JSON files under /tmp/scholar_cache/ inside the container, so they are presumably lost whenever the container is rebuilt or restarted.
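
The flow above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Space's actual code: `get_author_data`, the `fetch` callback standing in for the SerpAPI call, the `<author_id>.json` file naming, and the use of `_cached_at` to compute age are all hypothetical. Only the 24-hour lifetime, the `/tmp/scholar_cache/` directory, and the `_source` values come from this README.

```python
import json
import time
from pathlib import Path
from typing import Callable

CACHE_DIR = Path("/tmp/scholar_cache")  # location stated in this README
CACHE_TTL_SECONDS = 24 * 60 * 60        # 24 hours


def get_author_data(
    author_id: str,
    fetch: Callable[[str], dict],
    cache_dir: Path = CACHE_DIR,
    now: Callable[[], float] = time.time,
) -> dict:
    """Return cached data when fresh, otherwise call `fetch` and cache the result.

    `fetch` is a hypothetical stand-in for the real SerpAPI request.
    """
    cache_file = cache_dir / f"{author_id}.json"  # file naming is an assumption

    if cache_file.exists():
        cached = json.loads(cache_file.read_text(encoding="utf-8"))
        age = now() - cached.get("_cached_at", 0)
        if age < CACHE_TTL_SECONDS:
            cached["_source"] = "cache"
            return cached

    # Cache miss or expired entry: fetch fresh data and store it.
    data = dict(fetch(author_id))
    data["_cached_at"] = int(now())
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(data), encoding="utf-8")
    data["_source"] = "serpapi"
    return data
```

Passing `fetch` and `now` as parameters keeps the sketch testable without network access. A real implementation would also want to validate `author_id` before using it in a file path.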

Response Schema

{
  "author": {
    "name": "...",
    "affiliations": "...",
    "thumbnail": "...",
    "interests": [{"title": "...", "link": "..."}]
  },
  "citation_stats": {
    "table": [
      {"citations": {"all": 12345, "since_2021": 6789}},
      {"h_index": {"all": 50, "since_2021": 30}},
      {"i10_index": {"all": 100, "since_2021": 60}}
    ],
    "graph": [{"year": 2020, "citations": 500}, ...]
  },
  "articles": [
    {
      "title": "...",
      "link": "...",
      "authors": "...",
      "publication": "...",
      "cited_by_value": 123,
      "year": "2023"
    }
  ],
  "_source": "cache | serpapi",
  "_cached_at": 1700000000,
  "_cached_at_human": "2025-01-01T00:00:00+00:00"
}
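
Because `citation_stats.table` is a list of single-key objects rather than one flat object, reading an individual metric takes a small transformation. The sketch below relies only on the schema shown above; `flatten_citation_table` is a hypothetical helper name, and the sample values are the placeholder numbers from the schema.

```python
def flatten_citation_table(table: list[dict]) -> dict:
    """Merge [{"citations": {...}}, {"h_index": {...}}, ...] into one dict."""
    flat = {}
    for entry in table:
        flat.update(entry)
    return flat


# Placeholder values copied from the schema above, not real data.
table = [
    {"citations": {"all": 12345, "since_2021": 6789}},
    {"h_index": {"all": 50, "since_2021": 30}},
    {"i10_index": {"all": 100, "since_2021": 60}},
]
metrics = flatten_citation_table(table)
print(metrics["h_index"]["all"])  # 50
```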

Local Development

export SERPAPI_KEY="your_key_here"
pip install -r requirements.txt
uvicorn app:app --reload --port 7860

License

MIT