metadata
title: Studyrag
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
Studyson - RAG Document QA & Summarization
A full-stack Retrieval-Augmented Generation (RAG) app for document Q&A, conversational chat, and summarization. Built with FastAPI, LlamaIndex, Groq, and a persistent Chroma vector store.
Features
- Multi-format ingestion: PDF, DOCX, TXT, and Markdown files
- Web scraping: index any HTML page (with a timeout, size cap, and content-type guard)
- Conversational chat: multi-turn Q&A with per-session memory
- Persistent vector store: Chroma on disk, so the index survives restarts
- Smart summarization: length-controlled summaries across all indexed documents
- Source citations: verifiable snippets with similarity scores
- Real-time streaming: token-by-token Server-Sent Events
- Markdown rendering: chat answers render with code blocks, lists, and headings
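The scraping guards listed above (content-type check and size cap) can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the helper names `is_html` and `read_capped` are hypothetical, and the chunks are assumed to come from a streamed HTTP response (for example httpx's `response.iter_bytes()`).

```python
from typing import Iterable

# Mirrors the documented MAX_SCRAPE_BYTES default of 5 MB.
MAX_SCRAPE_BYTES = 5 * 1024 * 1024


def is_html(content_type: str) -> bool:
    """Return True when a Content-Type header value describes an HTML page."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in {"text/html", "application/xhtml+xml"}


def read_capped(chunks: Iterable[bytes], max_bytes: int = MAX_SCRAPE_BYTES) -> bytes:
    """Join body chunks, raising as soon as the total exceeds max_bytes."""
    body = bytearray()
    for chunk in chunks:
        body.extend(chunk)
        if len(body) > max_bytes:
            # Stop early instead of buffering an arbitrarily large page.
            raise ValueError(f"response body exceeds {max_bytes} bytes")
    return bytes(body)
```

Checking the size while streaming, rather than after the full download, keeps memory use bounded even when a server sends no Content-Length header.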
Tech Stack
| Layer | Library |
|---|---|
| Web framework | FastAPI >=0.118 |
| RAG orchestration | LlamaIndex >=0.14 |
| LLM | Groq llama-3.3-70b-versatile |
| Embeddings | FastEmbed BAAI/bge-small-en-v1.5 |
| Vector store | Chroma >=0.6 (persistent) |
| Document parsing | PyMuPDF · pypdf · python-docx |
| HTTP client | httpx (async, with timeouts) |
| Frontend | Vanilla JS + marked + DOMPurify |
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Web UI |
| POST | /upload | Upload PDF, DOCX, TXT, or MD |
| POST | /scrape_and_index | Scrape and index a URL |
| POST | /stream_query | SSE streaming Q&A (per-session chat memory) |
| POST | /query | One-shot Q&A with source citations |
| POST | /summarize | Summarize all indexed content |
| POST | /reset | Drop the index and clear all sessions |
| GET | /status | System status, indexed docs, active model |
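A client consuming /stream_query needs to decode the Server-Sent Events framing. The sketch below handles only the generic SSE wire format (`data:` fields separated by blank lines). The request body and the content of each event are not documented here, so they are left out. The function name `iter_sse_data` is hypothetical, and the lines are assumed to come from a streamed response (for example httpx's `response.iter_lines()`).

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event in a line stream."""
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield "\n".join(data_lines)
                data_lines = []
            continue
        if line.startswith(":"):
            # Comment line, often used as a keep-alive ping.
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # The format allows one optional space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    # Lenient choice: flush a trailing event even without a final blank line.
    if data_lines:
        yield "\n".join(data_lines)
```

Each yielded string is one event's payload. Whether that payload is a raw token or a JSON object depends on the server and should be checked against the actual responses.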
Configuration
Set via HF Space secrets or a .env file locally:
| Variable | Default | Purpose |
|---|---|---|
| GROQ_API_KEY | (required) | Groq API key |
| GROQ_MODEL | llama-3.3-70b-versatile | Groq chat model |
| EMBED_MODEL | BAAI/bge-small-en-v1.5 | Embedding model |
| MAX_FILE_SIZE | 20971520 (20 MB) | Upload size limit |
| MAX_SCRAPE_BYTES | 5242880 (5 MB) | Scrape body cap |
| SIMILARITY_TOP_K | 4 | Retrieval top-k |
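As an illustration of how these variables and defaults fit together, here is a minimal sketch of a settings loader. It is an assumption about structure, not the app's actual configuration code: the `Settings` class and `load_settings` function are hypothetical names, and a local .env file would still need to be loaded into the environment first (for example by Docker Compose or a tool such as python-dotenv).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    groq_model: str
    embed_model: str
    max_file_size: int
    max_scrape_bytes: int
    similarity_top_k: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, applying the documented defaults."""
    api_key = env.get("GROQ_API_KEY", "")
    if not api_key:
        # The only variable without a default.
        raise RuntimeError("GROQ_API_KEY is required")
    return Settings(
        groq_api_key=api_key,
        groq_model=env.get("GROQ_MODEL", "llama-3.3-70b-versatile"),
        embed_model=env.get("EMBED_MODEL", "BAAI/bge-small-en-v1.5"),
        max_file_size=int(env.get("MAX_FILE_SIZE", "20971520")),  # 20 MB
        max_scrape_bytes=int(env.get("MAX_SCRAPE_BYTES", "5242880")),  # 5 MB
        similarity_top_k=int(env.get("SIMILARITY_TOP_K", "4")),
    )
```

Passing the environment as a parameter makes the loader easy to exercise in tests without touching the real process environment.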
Local Development
git clone <repo-url>
cd studyrag
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env # add your GROQ_API_KEY
uvicorn app.main:app --reload --port 7860
Docker
docker compose up --build
Volumes persist uploads/, chroma_store/, and the FastEmbed model cache across restarts.
Deploying on Hugging Face Spaces
- Push this repo to GitHub
- Go to huggingface.co → your profile → New Space
- Select Docker SDK, link your GitHub repo
- Add GROQ_API_KEY under Settings → Variables and secrets
- The Space auto-builds and serves on port 7860
Note: The Chroma store and uploads persist within the Space filesystem but are wiped on a factory reset.