PapersRAG-1.5B 🧪
A retrieval-augmented generation system for querying recent scientific literature — continuously updated.
PapersRAG-1.5B helps researchers explore and answer questions across a growing corpus of recent NLP papers from arXiv. It pairs a lightweight language model with a curated knowledge base of paper abstracts and a retrieval pipeline that prioritizes faithful, citation-backed answers over hallucination.
The knowledge base is automatically refreshed every day with the latest cs.CL papers. It expands on its own, with no manual upkeep required.
Model description
- Type: Retrieval-augmented generation (RAG)
- Base language model: Qwen 2.5 1.5B — small, fast, coherent when grounded with good context
- Knowledge base: A continuously growing collection of abstracts from the most recent cs.CL papers on arXiv, updated daily via an automated pipeline
- Retrieval pipeline: Dense embeddings for initial candidate retrieval, cross-encoder for re-ranking — only the most relevant chunks reach the language model
- Answer style: Every answer cites the paper title it draws from. If no relevant paper is found, the model says so instead of fabricating one
Intended use
PapersRAG is a research assistant. It helps scientists and students locate information within indexed NLP papers, ask comparative questions like "What are the latest trends in retrieval-augmented generation?", and surface specific details about a paper's methodology or findings.
It is not a general-purpose chatbot. It does not have access to full paper text. It only knows what has been explicitly indexed. It will tell you when it doesn't know something.
How it works
- Indexing — Paper abstracts are split into overlapping chunks, embedded with a dense bi-encoder, and stored in a FAISS index
- Retrieval — The bi-encoder fetches a pool of candidate chunks for any given question
- Re-ranking — A cross-encoder scores each candidate; only chunks above a confidence threshold are kept
- Generation — Retained chunks are passed as context to the 1.5B model, which generates a cited answer
- Safety — If nothing clears the confidence threshold, the model refuses to answer rather than hallucinate
No relevant chunk, no answer. That's the rule.
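The retrieve, re-rank, and refuse steps above can be sketched in miniature. The embedding and scoring functions below are toy stand-ins for illustration only, not the repository's actual bi-encoder or cross-encoder:

```python
import re
import numpy as np

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def embed(texts):
    # Toy stand-in for the dense bi-encoder: hash words into a small vector.
    vecs = np.zeros((len(texts), 32))
    for i, text in enumerate(texts):
        for word in tokenize(text):
            vecs[i, hash(word) % 32] += 1.0
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.maximum(norms, 1e-9)

def cross_score(question, chunk):
    # Toy stand-in for the cross-encoder: question-word overlap with the chunk.
    q, c = set(tokenize(question)), set(tokenize(chunk))
    return len(q & c) / max(len(q), 1)

def answer(question, chunks, top_k=3, threshold=0.5):
    # 1. Retrieval: cosine similarity between the question and every chunk.
    q_vec = embed([question])[0]
    sims = embed(chunks) @ q_vec
    candidates = np.argsort(sims)[::-1][:top_k]
    # 2. Re-ranking: keep only candidates that clear the confidence threshold.
    kept = [chunks[i] for i in candidates
            if cross_score(question, chunks[i]) >= threshold]
    # 3. Safety: refuse rather than hallucinate when nothing qualifies.
    if not kept:
        return "No relevant paper found in the knowledge base."
    return "Context for generation: " + " | ".join(kept)

chunks = [
    "Retrieval-augmented generation grounds language models in retrieved documents.",
    "This paper studies image segmentation with diffusion models.",
]
print(answer("What is retrieval-augmented generation?", chunks))
```

In the real pipeline, the retained chunks would be formatted into a prompt for the 1.5B model together with the source paper titles.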
Automated daily updates
Every day, the update pipeline:
- Downloads the existing index and chunk store from this repository
- Scrapes the 100 most recent cs.CL papers from arXiv
- Chunks, embeds, and appends the new papers to the existing knowledge base
- Rebuilds the FAISS index and uploads everything back
The knowledge base grows by roughly 100 papers per day, automatically.
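A minimal sketch of the chunk-and-append step. The chunk size, overlap, and embedder here are assumptions for illustration, and a plain numpy matrix stands in for the actual FAISS index:

```python
import numpy as np

def chunk_abstract(text, size=40, overlap=10):
    # Split an abstract into overlapping word windows (sizes are illustrative).
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def toy_embed(chunks):
    # Toy embedder: one row per chunk; the real pipeline uses a bi-encoder.
    return np.array([[len(c.split())] for c in chunks], dtype=float)

def update_knowledge_base(store, new_abstracts, embed):
    # Chunk and embed the new papers, then append to the existing store.
    new_chunks = [c for a in new_abstracts for c in chunk_abstract(a)]
    new_vecs = embed(new_chunks)
    store["chunks"].extend(new_chunks)
    # "Rebuilding" the index here is just re-stacking the embedding matrix;
    # the real pipeline rebuilds a FAISS index instead.
    store["index"] = (np.vstack([store["index"], new_vecs])
                      if store["index"].size else new_vecs)
    return store

store = {"chunks": [], "index": np.empty((0, 1))}
store = update_knowledge_base(store, ["word " * 60], toy_embed)
print(len(store["chunks"]), store["index"].shape)
```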
Quick start
```python
import sys
from huggingface_hub import snapshot_download

# Download the model, index, and pipeline code from the Hub
model_dir = snapshot_download("metaresearch/PapersRAG-1.5B")
sys.path.insert(0, model_dir)  # pipeline.py ships inside the snapshot

from pipeline import PapersRAG

rag = PapersRAG(model_dir)
print(rag.ask("What are the latest approaches to retrieval-augmented generation?"))
```
Requires transformers, sentence-transformers, and faiss (the faiss-cpu package on PyPI). Everything else is handled by pipeline.py.
Model composition
| Component | Description |
|---|---|
| Language Model | Qwen 2.5 1.5B (float16) |
| Bi-encoder | Dense embedding model for initial retrieval |
| Cross-encoder | Re-ranking model that scores chunks for relevance |
| Vector Index | FAISS index of embedded paper chunks |
| Knowledge Chunks | Processed snippets from indexed arXiv abstracts |
| Pipeline | pipeline.py — one class, handles loading, retrieval, and generation |
Exact model names for the bi-encoder and cross-encoder are in the repository's configuration files.
Limitations
Knowledge base scope. Only cs.CL papers from arXiv. Papers from other fields are not included unless manually added.
Abstracts only. Full paper text is not indexed. Deep methodological comparisons may be incomplete.
Small language model. 1.5B parameters is lightweight. The retrieval pipeline handles factual accuracy well, but nuanced multi-paper synthesis has limits.
English only.
License
Apache-2.0.
PapersRAG is part of the Meta Research initiative — building open tools that accelerate scientific discovery.