---
title: Google Scholar Citation API
emoji: 📚
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# Google Scholar Citation API
A lightweight API that fetches **Google Scholar author citation metrics** via [SerpAPI](https://serpapi.com/), with **24-hour caching** to minimize API calls.
## Features
- **Citation Metrics**: Retrieve h-index, i10-index, total citations, citation graph, and article list for any Google Scholar author.
- **Daily Cache**: Each author's data is cached for 24 hours, so at most one SerpAPI call is made per author per day, no matter how many requests hit the API.
- **Fast & Lightweight**: Built with FastAPI + uvicorn, deployed as a Docker-based HuggingFace Space.
## API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Welcome message & usage hint |
| `/citations?author_id=XXX` | GET | Get citation data for an author |
| `/cache/status?author_id=XXX` | GET | Check cache freshness for an author |
| `/health` | GET | Health check |
| `/docs` | GET | Interactive Swagger UI |
## Setup
### 1. Create a HuggingFace Space
1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. Choose **Docker** as the SDK
3. Push this repo to the Space
### 2. Set the SerpAPI Key
Go to **Space Settings → Secrets** and add:
| Secret Name | Value |
|---|---|
| `SERPAPI_KEY` | Your SerpAPI API key |
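Spaces expose secrets to the container as environment variables, so the app can read the key at startup. The following is a minimal sketch, assuming a hypothetical helper named `load_serpapi_key`; the actual app may read the key differently.

```python
import os


def load_serpapi_key(env=None):
    """Return the SerpAPI key from the environment, or raise if it is missing.

    `load_serpapi_key` is a hypothetical helper, not part of this repo's API.
    """
    env = os.environ if env is None else env
    key = env.get("SERPAPI_KEY", "").strip()
    if not key:
        raise RuntimeError("SERPAPI_KEY is not set; add it as a Space secret")
    return key
```

Failing early with a clear message makes a missing secret easier to diagnose than an authentication error on the first request.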
### 3. Usage Example
```bash
# Replace YOUR-USERNAME-YOUR-SPACE with your Space's subdomain and JicYPdAAAAAJ with the author ID
curl "https://YOUR-USERNAME-YOUR-SPACE.hf.space/citations?author_id=JicYPdAAAAAJ"
```
### Python Example
```python
import requests

# Query the /citations endpoint for one author.
resp = requests.get(
    "https://YOUR-USERNAME-YOUR-SPACE.hf.space/citations",
    params={"author_id": "JicYPdAAAAAJ"},
    timeout=30,
)
resp.raise_for_status()  # fail loudly on HTTP errors
data = resp.json()

print(data["citation_stats"]["table"])  # citations, h-index, i10-index
print(data["author"]["name"])
```
## How Caching Works
```
Request → Is there a cache file for this author_id?
├─ YES & age < 24h → return cached data (_source: "cache")
└─ NO or expired → call SerpAPI → save to cache → return (_source: "serpapi")
```
Cache is stored as JSON files under `/tmp/scholar_cache/` inside the container.
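The flow above can be sketched in a few lines of Python. This is an illustration rather than the Space's actual code: `get_citations`, `_cache_path`, and the one-file-per-author naming scheme are hypothetical, and `fetch` stands in for whatever function performs the SerpAPI call.

```python
import json
import time
from pathlib import Path

CACHE_DIR = Path("/tmp/scholar_cache")  # location taken from this README
CACHE_TTL = 24 * 60 * 60  # 24 hours, in seconds


def _cache_path(author_id, cache_dir=CACHE_DIR):
    # Hypothetical naming scheme: one JSON file per author_id.
    return Path(cache_dir) / f"{author_id}.json"


def get_citations(author_id, fetch, cache_dir=CACHE_DIR, ttl=CACHE_TTL, now=None):
    """Return cached data if it is younger than `ttl`, else call `fetch`."""
    now = time.time() if now is None else now
    path = _cache_path(author_id, cache_dir)

    if path.exists():
        cached = json.loads(path.read_text())
        if now - cached.get("_cached_at", 0) < ttl:
            cached["_source"] = "cache"
            return cached

    # Cache miss or expired entry: fetch fresh data and store it.
    data = dict(fetch(author_id))
    data["_cached_at"] = int(now)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data))
    data["_source"] = "serpapi"
    return data
```

Passing `fetch` and `now` as arguments keeps the sketch testable without network access or waiting for real time to pass.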
## Response Schema
```json
{
  "author": {
    "name": "...",
    "affiliations": "...",
    "thumbnail": "...",
    "interests": [{"title": "...", "link": "..."}]
  },
  "citation_stats": {
    "table": [
      {"citations": {"all": 12345, "since_2021": 6789}},
      {"h_index": {"all": 50, "since_2021": 30}},
      {"i10_index": {"all": 100, "since_2021": 60}}
    ],
    "graph": [{"year": 2020, "citations": 500}, ...]
  },
  "articles": [
    {
      "title": "...",
      "link": "...",
      "authors": "...",
      "publication": "...",
      "cited_by_value": 123,
      "year": "2023"
    }
  ],
  "_source": "cache | serpapi",
  "_cached_at": 1700000000,
  "_cached_at_human": "2023-11-14T22:13:20+00:00"
}
```
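Because `citation_stats.table` is a list of single-key objects, reading one metric means scanning the list. A small helper, sketched here under the assumption that responses follow the schema above, merges the entries into one dictionary. The name `flatten_stats` is hypothetical and not part of the API.

```python
def flatten_stats(table):
    """Merge a list of single-key dicts into one dict keyed by metric name."""
    merged = {}
    for entry in table:
        merged.update(entry)
    return merged


# Sample values copied from the schema above.
table = [
    {"citations": {"all": 12345, "since_2021": 6789}},
    {"h_index": {"all": 50, "since_2021": 30}},
    {"i10_index": {"all": 100, "since_2021": 60}},
]
stats = flatten_stats(table)
print(stats["h_index"]["all"])  # 50
```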
## Local Development
```bash
export SERPAPI_KEY="your_key_here"
pip install -r requirements.txt
uvicorn app:app --reload --port 7860
```
## License
MIT