PapersRAG-1.5B 🧪

A retrieval-augmented generation system for querying recent scientific literature — continuously updated.

PapersRAG-1.5B helps researchers explore and answer questions across a growing corpus of recent NLP papers from arXiv. It pairs a lightweight language model with a curated knowledge base of paper abstracts and a retrieval pipeline that prioritizes faithful, citation-backed answers over hallucination.

The knowledge base is refreshed automatically every day with the latest cs.CL papers; it expands on its own, with no manual upkeep required.


Model description

  • Type: Retrieval-augmented generation (RAG)
  • Base language model: Qwen 2.5 1.5B — small, fast, coherent when grounded with good context
  • Knowledge base: A continuously growing collection of abstracts from the most recent cs.CL papers on arXiv, updated daily via an automated pipeline
  • Retrieval pipeline: Dense embeddings for initial candidate retrieval, cross-encoder for re-ranking — only the most relevant chunks reach the language model
  • Answer style: Every answer cites the paper title it draws from. If no relevant paper is found, the model says so instead of fabricating one

Intended use

PapersRAG is a research assistant. It helps scientists and students locate information within indexed NLP papers, ask comparative questions like "What are the latest trends in retrieval-augmented generation?", and surface specific details about a paper's methodology or findings.

It is not a general-purpose chatbot. It does not have access to full paper text. It only knows what has been explicitly indexed. It will tell you when it doesn't know something.


How it works

  1. Indexing — Paper abstracts are split into overlapping chunks, embedded with a dense bi-encoder, and stored in a FAISS index
  2. Retrieval — The bi-encoder fetches a pool of candidate chunks for any given question
  3. Re-ranking — A cross-encoder scores each candidate; only chunks above a confidence threshold are kept
  4. Generation — Retained chunks are passed as context to the 1.5B model, which generates a cited answer
  5. Safety — If nothing clears the confidence threshold, the model refuses to answer rather than hallucinate

No relevant chunk, no answer. That's the rule.
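The retrieve, re-rank, refuse flow above can be sketched as follows. This is a minimal illustration with toy scoring functions standing in for the real bi-encoder and cross-encoder; the `embed` and `cross_score` functions, the threshold value, and the three-chunk corpus are all assumptions for demonstration, not the repository's actual components:

```python
import numpy as np

THRESHOLD = 0.25   # hypothetical confidence cut-off
TOP_K = 3          # size of the candidate pool from dense retrieval

chunks = [
    {"title": "Paper A", "text": "retrieval augmented generation with dense passage retrieval"},
    {"title": "Paper B", "text": "reinforcement learning for robotic locomotion control"},
    {"title": "Paper C", "text": "cross encoder re-ranking improves retrieval quality"},
]

def embed(text):
    """Toy bi-encoder: bag-of-letters unit vector (stand-in for a real dense model)."""
    v = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            v[ord(ch) - ord("a")] += 1
    return v / (np.linalg.norm(v) or 1.0)

# The "index": one embedding per chunk (FAISS plays this role in the real pipeline)
index = np.stack([embed(c["text"]) for c in chunks])

def cross_score(question, text):
    """Toy cross-encoder: word-overlap ratio (stand-in for a real re-ranker)."""
    q, t = set(question.lower().split()), set(text.lower().split())
    return len(q & t) / len(q)

def answer(question):
    # 1. dense retrieval: cosine similarity against every indexed chunk
    sims = index @ embed(question)
    candidates = [chunks[i] for i in np.argsort(sims)[::-1][:TOP_K]]
    # 2. re-rank and keep only chunks that clear the confidence threshold
    kept = [c for c in candidates if cross_score(question, c["text"]) >= THRESHOLD]
    # 3. refuse rather than hallucinate when nothing clears the bar
    if not kept:
        return "No relevant paper found in the knowledge base."
    # 4. in the real pipeline the kept chunks go to the 1.5B model as context;
    #    here we just show the citations that would accompany the answer
    return "Sources: " + "; ".join(c["title"] for c in kept)
```

An on-topic question keeps the relevant chunks and returns their titles as citations, while an out-of-scope question produces the refusal string instead of a fabricated answer.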


Automated daily updates

Every day, the update pipeline:

  • Downloads the existing index and chunk store from this repository
  • Scrapes the 100 most recent papers from cs.CL on arXiv
  • Chunks, embeds, and appends the new papers to the existing knowledge base
  • Rebuilds the FAISS index and uploads everything back

The knowledge base grows by roughly 100 papers per day, automatically.


Quick start

from huggingface_hub import snapshot_download
from pipeline import PapersRAG

# Download the model, FAISS index, and chunk store from the Hub
model_dir = snapshot_download("metaresearch/PapersRAG-1.5B")

rag = PapersRAG(model_dir)

print(rag.ask("What are the latest approaches to retrieval-augmented generation?"))

Requires transformers, sentence-transformers, and faiss. Everything else is in pipeline.py.
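The dependencies can be installed with pip. Note that the FAISS package name depends on your platform; faiss-cpu is shown here as one common option, not a requirement stated by this repository:

```shell
pip install transformers sentence-transformers faiss-cpu
```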


Model composition

  • Language model: Qwen 2.5 1.5B (float16)
  • Bi-encoder: dense embedding model for initial retrieval
  • Cross-encoder: re-ranking model that scores chunks for relevance
  • Vector index: FAISS index of embedded paper chunks
  • Knowledge chunks: processed snippets from indexed arXiv abstracts
  • Pipeline: pipeline.py — one class that handles loading, retrieval, and generation

Exact model names for the bi-encoder and cross-encoder are in the repository's configuration files.


Limitations

Knowledge base scope. Only cs.CL papers from arXiv. Papers from other fields are not included unless manually added.

Abstracts only. Full paper text is not indexed. Deep methodological comparisons may be incomplete.

Small language model. At 1.5B parameters, the generator is lightweight. Grounded retrieval keeps individual answers accurate, but nuanced multi-paper synthesis has limits.

English only.


License

Apache-2.0.


PapersRAG is part of the Meta Research initiative — building open tools that accelerate scientific discovery.
