---
title: RAG
emoji:
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 6.1.0
app_file: app.py
pinned: false
license: apache-2.0
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference


# RAG Document Assistant

A complete Retrieval-Augmented Generation (RAG) Python project using:
- HuggingFace Transformers
- sentence-transformers (for embeddings)
- ChromaDB (vector store)
- Gradio (UI)

## Project Structure

```
project/
├── app.py
├── rag_pipeline.py
├── generator.py
├── utils.py
├── requirements.txt
├── README.md
├── data/
└── db/
```


## Installation
1. Create a virtual environment (recommended):

```bash
python -m venv venv
source venv/bin/activate  # on Windows use venv\Scripts\activate
```

2. Install the requirements:

```bash
pip install -r requirements.txt
```

3. Run the app:

```bash
python app.py
```

Open the Gradio URL shown in the console (default http://127.0.0.1:7860).

## How RAG Works (short)

1. Documents are uploaded and their text is extracted.
2. The text is chunked into overlapping passages.
3. Each chunk is embedded using a pretrained sentence-transformer.
4. Chunks and their embeddings are stored in a vector database (ChromaDB).
5. At query time, the user's question is embedded and used to retrieve the most relevant chunks.
6. The retrieved chunks are passed to a generator LLM, which composes a grounded answer.

## Notes & Troubleshooting

- Textract may need system-level dependencies for PDF/DOCX parsing on some platforms.
- Large models may require GPUs. For local CPU usage, prefer small models like `flan-t5-small`.
- If you see memory errors, reduce the model size or run on a machine with more RAM.
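A quick sanity check for the CPU-friendly model mentioned above can be done with the `transformers` pipeline API. The Hub ID `google/flan-t5-small` and the prompt are illustrative; the first run downloads the model weights.

```python
# Load a small text2text model on CPU and run a single prompt.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-small")
result = generator("Answer briefly: what does RAG stand for?", max_new_tokens=32)
print(result[0]["generated_text"])
```

If this runs without memory errors, the same model should work as the generator in the app.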

## Roadmap / Improvements

- Add user authentication and per-user collections
- Support incremental indexing and deletion
- Add streaming generation for long answers
- Add API endpoints via FastAPI