---
title: PDF Q&A (Gemini RAG)
emoji: 🧠
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: "5.49.1"
app_file: app.py
pinned: false
---
# PDF Q&A (RAG) with Gemini 2.5 Flash (Hugging Face Space)
This Space lets you upload PDFs and ask questions. It uses:
- **LangChain** text splitters (document-specific splitting for Markdown/Python/JS, plus a generic recursive splitter).
- **FAISS** for vector search.
- **Gemini** for **embeddings** (`text-embedding-004`) and **generation** (`gemini-2.5-flash`).
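
To make the flow concrete, here is a minimal, dependency-free sketch of the retrieve-then-prompt step. It is not the code in `app.py`: brute-force cosine similarity stands in for the FAISS index, the 2-D vectors are toy stand-ins for `text-embedding-004` embeddings, and `top_k`, `build_prompt`, and the prompt wording are hypothetical names chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=2):
    """Indices of the k chunks most similar to the query, best first."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble a grounded prompt (hypothetical wording, not the app's)."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say \"I don't know\".\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Toy data: three chunks with made-up 2-D "embeddings".
chunks = ["Refunds take 5 days.", "Office hours are 9-5.", "Refunds need a receipt."]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
query_vector = [1.0, 0.0]

best = top_k(query_vector, vectors, k=2)
prompt = build_prompt("How long do refunds take?", [chunks[i] for i in best])
```

In the real app the resulting prompt would be sent to `gemini-2.5-flash`; that call is omitted here because it needs network access and an API key.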
## Quick start (on Hugging Face)
1. Create a new **Space** → **Gradio (Python)**.
2. Add these files: `app.py`, `requirements.txt`, and this `README.md`.
3. In the Space, go to **Settings → Variables and secrets** and add:
   - Key: `GEMINI_API_KEY`
   - Value: *your Gemini API key* (do **not** commit it in code).
4. Click **Restart** to build & launch the Space.
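
A secret added this way is exposed to the running Space as an environment variable. A minimal sketch of how the app could read it and fail with a clear message when it is missing; `get_api_key` is a hypothetical helper for illustration, and the actual `app.py` may do this differently.

```python
import os


def get_api_key(env=None):
    """Return the Gemini API key from the environment, or raise a clear error.

    Hypothetical helper: the real app.py may read the key differently.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set. Add it under Settings -> Variables "
            "and secrets (Space) or export it in your shell (local dev)."
        )
    return key
```

Passing a plain dict as `env` makes the helper easy to check without touching the real environment.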
## Local dev
```bash
python -m venv .venv
source .venv/bin/activate # on Windows: .venv\Scripts\activate
pip install -r requirements.txt
export GEMINI_API_KEY="YOUR_KEY_HERE"
python app.py
```
## Notes
- The app tries the newer `google-genai` SDK first (`from google import genai`) and falls back to the legacy `google-generativeai` package (`import google.generativeai as genai`) if that import fails.
- The document-splitting logic is heuristic-based:
  - Markdown-style content → `MarkdownTextSplitter`
  - Python-like content → `PythonCodeTextSplitter`
  - JavaScript-like content → `RecursiveCharacterTextSplitter.from_language(Language.JS, ...)`
  - Otherwise → `RecursiveCharacterTextSplitter`
- If an answer is not in the context, the model is instructed to say it doesn't know.
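
The splitter choice described above could be sketched as follows. The regular expressions, the `min_hits` threshold, the chunk sizes, and the names `guess_content_type` and `make_splitter` are all assumptions made for illustration; the heuristics in `app.py` may differ. The sketch also assumes the splitters are imported from the `langchain_text_splitters` package.

```python
import re

# Line-start patterns per content type (assumed heuristics, not the app's).
_PATTERNS = {
    "markdown": re.compile(r"^(#{1,6} |[-*] |```)", re.MULTILINE),
    "python": re.compile(
        r"^[ \t]*(def \w+\(|class \w+|import \w+|from \w+ import )", re.MULTILINE
    ),
    "js": re.compile(
        r"^[ \t]*(function \w+\(|(?:const|let|var) \w+ =|export )", re.MULTILINE
    ),
}


def guess_content_type(text, min_hits=2):
    """Return 'markdown', 'python', 'js', or 'generic' for a block of text."""
    hits = {label: len(p.findall(text)) for label, p in _PATTERNS.items()}
    best = max(hits, key=hits.get)  # ties resolve in dict order
    return best if hits[best] >= min_hits else "generic"


def make_splitter(label, chunk_size=1000, chunk_overlap=150):
    """Map a label to the LangChain splitter named in the notes above."""
    # Imported lazily so guess_content_type works without LangChain installed.
    from langchain_text_splitters import (
        Language,
        MarkdownTextSplitter,
        PythonCodeTextSplitter,
        RecursiveCharacterTextSplitter,
    )

    kwargs = {"chunk_size": chunk_size, "chunk_overlap": chunk_overlap}
    if label == "markdown":
        return MarkdownTextSplitter(**kwargs)
    if label == "python":
        return PythonCodeTextSplitter(**kwargs)
    if label == "js":
        return RecursiveCharacterTextSplitter.from_language(Language.JS, **kwargs)
    return RecursiveCharacterTextSplitter(**kwargs)
```

With these assumed patterns, a Python file that contains several `# ` comment lines at the start of a line can be misread as Markdown, so treat the detection as a best-effort guess.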