| --- |
| title: Studyrag |
| emoji: π |
| colorFrom: blue |
| colorTo: green |
| sdk: docker |
| pinned: false |
| --- |
| |
| # Studyson: RAG Document QA & Summarization |
|
|
| A full-stack Retrieval-Augmented Generation (RAG) app for document Q&A, conversational chat, and summarization. Built with FastAPI, LlamaIndex, Groq, and a persistent Chroma vector store. |
|
|
| ## Features |
|
|
| - **Multi-format ingestion**: PDF, DOCX, TXT, and Markdown files |
| - **Web scraping**: Index any HTML page (with timeout, size cap, and content-type guard) |
| - **Conversational chat**: Multi-turn Q&A with per-session memory |
| - **Persistent vector store**: Chroma on disk; index survives restarts |
| - **Smart summarization**: Length-controlled summaries across all indexed documents |
| - **Source citations**: Verifiable snippets with similarity scores |
| - **Real-time streaming**: Token-by-token Server-Sent Events |
| - **Markdown rendering**: Chat answers render with code blocks, lists, and headings |
|
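The scraping guards listed above (size cap and content-type check) could look roughly like the sketch below. It is not the app's actual code: `is_html` and `read_capped` are hypothetical helper names, the accepted media types are an assumption, and the 5 MB default simply mirrors the documented `MAX_SCRAPE_BYTES` value.

```python
from typing import Iterable

MAX_SCRAPE_BYTES = 5 * 1024 * 1024  # 5 MB, mirroring the documented default


def is_html(content_type: str) -> bool:
    """Content-type guard (hypothetical helper): accept only HTML responses."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in {"text/html", "application/xhtml+xml"}


def read_capped(chunks: Iterable[bytes], max_bytes: int = MAX_SCRAPE_BYTES) -> bytes:
    """Size cap (hypothetical helper): collect streamed chunks up to max_bytes."""
    body = bytearray()
    for chunk in chunks:
        body.extend(chunk)
        if len(body) > max_bytes:
            # Stop reading as soon as the cap is exceeded.
            raise ValueError(f"response body exceeds {max_bytes} bytes")
    return bytes(body)
```

Checking the size while streaming, rather than after the download completes, keeps an oversized page from being held in memory in full.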
|
| ## Tech Stack |
|
|
| | Layer | Library | |
| |-------|---------| |
| | Web framework | FastAPI `>=0.118` | |
| | RAG orchestration | LlamaIndex `>=0.14` | |
| | LLM | Groq `llama-3.3-70b-versatile` | |
| | Embeddings | FastEmbed `BAAI/bge-small-en-v1.5` | |
| | Vector store | Chroma `>=0.6` (persistent) | |
| | Document parsing | PyMuPDF · pypdf · python-docx | |
| | HTTP client | httpx (async, with timeouts) | |
| | Frontend | Vanilla JS + marked + DOMPurify | |
|
|
| ## API Endpoints |
|
|
| | Method | Endpoint | Description | |
| |--------|----------|-------------| |
| | `GET` | `/` | Web UI | |
| | `POST` | `/upload` | Upload PDF, DOCX, TXT, or MD | |
| | `POST` | `/scrape_and_index` | Scrape and index a URL | |
| | `POST` | `/stream_query` | SSE streaming Q&A (per-session chat memory) | |
| | `POST` | `/query` | One-shot Q&A with source citations | |
| | `POST` | `/summarize` | Summarize all indexed content | |
| | `POST` | `/reset` | Drop the index and clear all sessions | |
| | `GET` | `/status` | System status, indexed docs, active model | |
|
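As a rough illustration of calling the one-shot endpoint, the sketch below builds a `POST /query` request with the standard library. The request schema is not documented here, so the JSON field name `question` is an assumption, and `build_query_request` is a hypothetical helper; the base URL uses port 7860 from the local development command.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # port taken from the local development command


def build_query_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /query endpoint."""
    # ASSUMPTION: the field name "question" is illustrative only; check the
    # app's request model for the real schema.
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to a running instance.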
|
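Because `/stream_query` responds with Server-Sent Events, a client has to split the stream into events. The sketch below parses the standard SSE `data:` field from an iterable of text lines. It assumes the app sends each chunk of the answer in a `data:` field; the exact event framing used by the app is not documented here, and `iter_sse_data` is a hypothetical helper name.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event in a stream of lines."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one leading space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        # Other fields (event:, id:, retry:) and comment lines are ignored here.
    # An event not terminated by a blank line is discarded, as in the SSE format.
```

Multiple `data:` lines within one event are joined with newlines, which matches how the SSE format defines multi-line payloads.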
| ## Configuration |
|
|
| Set via HF Space secrets or a `.env` file locally: |
|
|
| | Variable | Default | Purpose | |
| |----------|---------|---------| |
| | `GROQ_API_KEY` | *(required)* | Groq API key | |
| | `GROQ_MODEL` | `llama-3.3-70b-versatile` | Groq chat model | |
| | `EMBED_MODEL` | `BAAI/bge-small-en-v1.5` | Embedding model | |
| | `MAX_FILE_SIZE` | `20971520` (20 MB) | Upload size limit | |
| | `MAX_SCRAPE_BYTES` | `5242880` (5 MB) | Scrape body cap | |
| | `SIMILARITY_TOP_K` | `4` | Retrieval top-k | |
|
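The table above maps naturally onto a small settings loader. The following is a minimal sketch, not the app's actual configuration code: `Settings` and `load_settings` are hypothetical names, and only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    groq_model: str
    embed_model: str
    max_file_size: int
    max_scrape_bytes: int
    similarity_top_k: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    api_key = env.get("GROQ_API_KEY", "")
    if not api_key:
        # GROQ_API_KEY is the only variable without a default.
        raise RuntimeError("GROQ_API_KEY is required")
    return Settings(
        groq_api_key=api_key,
        groq_model=env.get("GROQ_MODEL", "llama-3.3-70b-versatile"),
        embed_model=env.get("EMBED_MODEL", "BAAI/bge-small-en-v1.5"),
        max_file_size=int(env.get("MAX_FILE_SIZE", "20971520")),      # 20 MB
        max_scrape_bytes=int(env.get("MAX_SCRAPE_BYTES", "5242880")),  # 5 MB
        similarity_top_k=int(env.get("SIMILARITY_TOP_K", "4")),
    )
```

Failing fast on a missing key surfaces a misconfigured Space at startup instead of on the first query.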
|
| ## Local Development |
|
|
| ```bash |
| git clone <repo-url> |
| cd studyrag |
| python -m venv venv && source venv/bin/activate |
| pip install -r requirements.txt |
| cp .env.example .env # add your GROQ_API_KEY |
| uvicorn app.main:app --reload --port 7860 |
| ``` |
|
|
| ## Docker |
|
|
| ```bash |
| docker compose up --build |
| ``` |
|
|
| Volumes persist `uploads/`, `chroma_store/`, and the FastEmbed model cache across restarts. |
|
|
| ## Deploying on Hugging Face Spaces |
|
|
| 1. Push this repo to GitHub |
| 2. Go to [huggingface.co](https://huggingface.co) → your profile → **New Space** |
| 3. Select the **Docker** SDK and link your GitHub repo |
| 4. Add `GROQ_API_KEY` under **Settings → Variables and secrets** |
| 5. The Space auto-builds and serves on port 7860 |
|
|
| > **Note:** The Chroma store and uploads persist within the Space filesystem but are wiped on a factory reset. |
|
|
| ## Acknowledgments |
|
|
| - [LlamaIndex](https://www.llamaindex.ai/) |
| - [Groq](https://groq.com/) |
| - [Chroma](https://www.trychroma.com/) |
| - [FastEmbed](https://github.com/qdrant/fastembed) |
| - [FastAPI](https://fastapi.tiangolo.com/) |
|
|