---
title: RAG
emoji: ⚡
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 6.1.0
app_file: app.py
pinned: false
license: apache-2.0
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# RAG Document Assistant

A complete Retrieval-Augmented Generation (RAG) Python project using:

- HuggingFace Transformers
- sentence-transformers (for embeddings)
- ChromaDB (vector store)
- Gradio (UI)

## Project Structure

```
project/
├── app.py
├── rag_pipeline.py
├── generator.py
├── utils.py
├── requirements.txt
├── README.md
├── data/
└── db/
```

## Installation

1. Create and activate a virtual environment (recommended):

```bash
python -m venv venv
source venv/bin/activate  # on Windows, run venv\Scripts\activate instead
```

2. Install the requirements:

```bash
pip install -r requirements.txt
```

3. Run the app:

```bash
python app.py
```

Then open the Gradio URL shown in the console (default `http://127.0.0.1:7860`).

## How RAG Works (short)

1. Documents are uploaded and their text is extracted.
2. The text is split into overlapping chunks.
3. Each chunk is embedded with a pretrained sentence-transformer model.
4. Chunks and their embeddings are stored in a vector database (ChromaDB).
5. At query time, the user's question is embedded and used to retrieve the most relevant chunks.
6. The retrieved chunks are passed to a generator LLM, which composes an answer grounded in them.

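
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's actual code: the bag-of-words `embed` function stands in for a sentence-transformer model, the in-memory `index` list stands in for a ChromaDB collection, and all names (`chunk_text`, `embed`, `cosine`, `retrieve`, `build_prompt`) are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into word windows; each window repeats the last `overlap` words of the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: word counts. A real pipeline would call a sentence-transformer here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, index, top_k=2):
    """Return the top_k chunks whose vectors are most similar to the question vector."""
    query_vec = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, contexts):
    """Assemble the text handed to the generator so the answer stays grounded in the chunks."""
    context_block = "\n\n".join(contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Assumed sample document, for illustration only.
document = (
    "ChromaDB stores the chunk embeddings in a vector database. "
    "Gradio provides the web interface where users upload documents and ask questions."
)
chunks = chunk_text(document, chunk_size=12, overlap=4)
index = [(chunk, embed(chunk)) for chunk in chunks]  # stand-in for the vector store
question = "Where are the embeddings stored?"
print(build_prompt(question, retrieve(question, index, top_k=1)))
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.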
## Notes & Troubleshooting

* `textract` may require system-level dependencies for PDF/DOCX parsing on some platforms.
* Large models may require a GPU. For local CPU use, prefer a small model such as `flan-t5-small`.
* If you see memory errors, switch to a smaller model or run on a machine with more RAM.

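
One way to apply the model-size advice is to pick the generator model from the available hardware, with an optional override. The sketch below is an assumption, not part of this project: `choose_model` and the `RAG_MODEL_NAME` environment variable are hypothetical, and GPU detection is left to the caller (for example via `torch.cuda.is_available()` when PyTorch is installed).

```python
import os
from typing import Optional

# Hugging Face model IDs from the FLAN-T5 family; only their relative size matters here.
CPU_MODEL = "google/flan-t5-small"
GPU_MODEL = "google/flan-t5-large"


def choose_model(has_gpu: bool, override: Optional[str] = None) -> str:
    """Return an explicit override if given, else a model sized for the hardware."""
    if override:
        return override
    return GPU_MODEL if has_gpu else CPU_MODEL


# RAG_MODEL_NAME is a hypothetical variable; unset means "use the default for this machine".
model_name = choose_model(has_gpu=False, override=os.environ.get("RAG_MODEL_NAME"))
print(model_name)
```

Keeping the choice in one function makes it easy to drop to a smaller model when memory errors appear, without touching the rest of the pipeline.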
## Roadmap / Improvements

* Add user authentication and per-user collections
* Support incremental indexing and deletion
* Add streaming generation for long answers
* Add API endpoints via FastAPI