---
title: Studyrag
emoji: 🃏
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
---

# Studyson: RAG Document QA & Summarization

A full-stack Retrieval-Augmented Generation (RAG) app for document Q&A, conversational chat, and summarization. Built with FastAPI, LlamaIndex, Groq, and a persistent Chroma vector store.

## Features

- **Multi-format ingestion**: PDF, DOCX, TXT, and Markdown files
- **Web scraping**: Index any HTML page (with timeout, size cap, and content-type guard)
- **Conversational chat**: Multi-turn Q&A with per-session memory
- **Persistent vector store**: Chroma on disk; index survives restarts
- **Smart summarization**: Length-controlled summaries across all indexed documents
- **Source citations**: Verifiable snippets with similarity scores
- **Real-time streaming**: Token-by-token Server-Sent Events
- **Markdown rendering**: Chat answers render with code blocks, lists, and headings
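
The content-type guard and size cap listed under web scraping could look roughly like the sketch below. It is an illustration under stated assumptions, not the app's actual code: `read_capped`, `ScrapeRejected`, and the chunk-iterator interface are hypothetical names, and the 5 MB default simply mirrors the documented `MAX_SCRAPE_BYTES` value. The request timeout would be configured on the HTTP client and is not shown.

```python
from typing import Iterable

MAX_SCRAPE_BYTES = 5 * 1024 * 1024  # 5 MB, matching the documented default


class ScrapeRejected(Exception):
    """Hypothetical error raised when a response fails a guard."""


def read_capped(content_type: str, chunks: Iterable[bytes],
                max_bytes: int = MAX_SCRAPE_BYTES) -> bytes:
    """Accept only HTML and stop reading once the body exceeds max_bytes."""
    # Ignore parameters such as "; charset=utf-8" when checking the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "text/html":
        raise ScrapeRejected(f"unsupported content type: {media_type!r}")

    body = bytearray()
    for chunk in chunks:
        body.extend(chunk)
        # Check after every chunk so an oversized page is abandoned early.
        if len(body) > max_bytes:
            raise ScrapeRejected(f"body exceeds {max_bytes} bytes")
    return bytes(body)
```

Reading the body in chunks, rather than trusting a `Content-Length` header, means the cap still holds when the header is missing or wrong.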

## Tech Stack

| Layer | Library |
|-------|---------|
| Web framework | FastAPI `>=0.118` |
| RAG orchestration | LlamaIndex `>=0.14` |
| LLM | Groq `llama-3.3-70b-versatile` |
| Embeddings | FastEmbed `BAAI/bge-small-en-v1.5` |
| Vector store | Chroma `>=0.6` (persistent) |
| Document parsing | PyMuPDF · pypdf · python-docx |
| HTTP client | httpx (async, with timeouts) |
| Frontend | Vanilla JS + marked + DOMPurify |

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET`  | `/` | Web UI |
| `POST` | `/upload` | Upload PDF, DOCX, TXT, or MD |
| `POST` | `/scrape_and_index` | Scrape and index a URL |
| `POST` | `/stream_query` | SSE streaming Q&A (per-session chat memory) |
| `POST` | `/query` | One-shot Q&A with source citations |
| `POST` | `/summarize` | Summarize all indexed content |
| `POST` | `/reset` | Drop the index and clear all sessions |
| `GET`  | `/status` | System status, indexed docs, active model |
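
A client consuming `/stream_query` needs to decode Server-Sent Events. The exact payload format is not documented here, so the sketch below assumes plain `data:` lines separated by blank lines, as in the standard SSE wire format. `iter_sse_data` is a hypothetical helper name; it works on any iterable of text lines, such as the line iterator of a streaming HTTP response.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event in a stream of lines."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips a single leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (":...") and other fields such as "event:" are ignored.
    if buffer:
        # Lenient choice: flush a trailing event that lacks a final blank line.
        yield "\n".join(buffer)
```

For example, feeding it `["data: Hello\n", "\n"]` yields the single payload `"Hello"`.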

## Configuration

Set these via Hugging Face Space secrets, or in a local `.env` file:

| Variable | Default | Purpose |
|----------|---------|---------|
| `GROQ_API_KEY` | *(required)* | Groq API key |
| `GROQ_MODEL` | `llama-3.3-70b-versatile` | Groq chat model |
| `EMBED_MODEL` | `BAAI/bge-small-en-v1.5` | Embedding model |
| `MAX_FILE_SIZE` | `20971520` (20 MB) | Upload size limit |
| `MAX_SCRAPE_BYTES` | `5242880` (5 MB) | Scrape body cap |
| `SIMILARITY_TOP_K` | `4` | Retrieval top-k |
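
One way to read these variables with the defaults from the table is sketched below. The `Settings` dataclass and `load_settings` function are hypothetical names used for illustration and may not match how the app actually loads its configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    groq_model: str
    embed_model: str
    max_file_size: int
    max_scrape_bytes: int
    similarity_top_k: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, applying documented defaults."""
    api_key = env.get("GROQ_API_KEY", "")
    if not api_key:
        # The only variable without a default; fail fast when it is missing.
        raise RuntimeError("GROQ_API_KEY is required")
    return Settings(
        groq_api_key=api_key,
        groq_model=env.get("GROQ_MODEL", "llama-3.3-70b-versatile"),
        embed_model=env.get("EMBED_MODEL", "BAAI/bge-small-en-v1.5"),
        max_file_size=int(env.get("MAX_FILE_SIZE", "20971520")),
        max_scrape_bytes=int(env.get("MAX_SCRAPE_BYTES", "5242880")),
        similarity_top_k=int(env.get("SIMILARITY_TOP_K", "4")),
    )
```

Passing the environment as a mapping keeps the loader easy to exercise with a plain dictionary, for instance `load_settings({"GROQ_API_KEY": "..."})`.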

## Local Development

```bash
git clone <repo-url>
cd studyrag
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env  # add your GROQ_API_KEY
uvicorn app.main:app --reload --port 7860
```

## Docker

```bash
docker compose up --build
```

Volumes persist `uploads/`, `chroma_store/`, and the FastEmbed model cache across restarts.

## Deploying on Hugging Face Spaces

1. Push this repo to GitHub
2. Go to [huggingface.co](https://huggingface.co) → your profile → **New Space**
3. Select **Docker** SDK, link your GitHub repo
4. Add `GROQ_API_KEY` under **Settings → Variables and secrets**
5. The Space auto-builds and serves on port 7860

> **Note:** The Chroma store and uploads persist within the Space filesystem but are wiped on a factory reset.

## Acknowledgments

- [LlamaIndex](https://www.llamaindex.ai/)
- [Groq](https://groq.com/)
- [Chroma](https://www.trychroma.com/)
- [FastEmbed](https://github.com/qdrant/fastembed)
- [FastAPI](https://fastapi.tiangolo.com/)