---
title: Gemini RAG Q&A API
emoji: πŸ€–
colorFrom: blue
colorTo: green
sdk: docker
app_port: 8000
---
# πŸ€– RAG Q&A API - Intelligent Document Query System
> A production-ready Retrieval-Augmented Generation (RAG) API that answers questions using custom knowledge bases. Built to demonstrate enterprise-grade AI/ML development skills.
<div style="display: flex; gap: 8px;">
<a href="https://manavraj-gemini-rag-api.hf.space/docs" target="_blank">
<img src="https://img.shields.io/badge/API-Try%20it%20Live-green?style=for-the-badge&logo=fastapi" alt="Try the Live API">
</a>
<a href="https://github.com/Manavraj-0/gemini_rag_api" target="_blank">
<img src="https://img.shields.io/badge/Code-View%20on%20GitHub-blue?style=for-the-badge&logo=github" alt="View on GitHub">
</a>
</div>
---
## 🎯 Overview
This project implements a RAG system that answers questions about custom documents using natural language. It retrieves relevant context from your documents before generating answers, ensuring responses are accurate and grounded in your data.
### What is RAG?
RAG (Retrieval-Augmented Generation) combines:
1. **Retrieval**: Finding relevant document chunks using semantic search
2. **Augmentation**: Adding retrieved context to the query
3. **Generation**: Creating accurate, source-backed answers
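
Below is a minimal sketch of that three-step flow in the same LangChain (LCEL) style this project uses. It assumes the FAISS index from `ingest.py` already exists and that `GEMINI_API_KEY` is set in `.env`; the prompt wording and the `models/gemini-embedding-001` identifier are illustrative assumptions, not code lifted from `main.py`:

```python
import os

from dotenv import load_dotenv
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

load_dotenv()
api_key = os.environ["GEMINI_API_KEY"]

# 1. Retrieval: load the FAISS index built by ingest.py and wrap it as a retriever
embeddings = GoogleGenerativeAIEmbeddings(
    model="models/gemini-embedding-001", google_api_key=api_key
)
db = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)
retriever = db.as_retriever()

def format_docs(docs):
    # Join the retrieved chunks into a single context string
    return "\n\n".join(d.page_content for d in docs)

# 2. Augmentation: retrieved chunks are stuffed into the prompt as {context}
prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n\n{context}\n\nQuestion: {question}"
)

# 3. Generation: Gemini answers from the supplied context
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.1, google_api_key=api_key)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
print(rag_chain.invoke("What is this document about?"))
```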
---
## ✨ Key Features
- 🧠 **Semantic Search**: FAISS vector database for intelligent context retrieval
- ⚑ **Fast Responses**: Optimized pipeline with <4s average response time
- 🌐 **FastAPI**: Clean API with automatic interactive documentation
- 🐳 **Docker Ready**: One-command deployment
---
## πŸ› οΈ Technology Stack
- **LLM**: Google Gemini 2.5 Flash
- **Embeddings**: Google `gemini-embedding-001`
- **Vector DB**: FAISS (CPU)
- **Framework**: LangChain (LCEL)
- **API**: FastAPI + Uvicorn
- **Deployment**: Docker + Hugging Face Spaces
---
## πŸš€ Quick Start
### Prerequisites
- Python 3.10+
- Google API key (get one from [Google AI Studio](https://aistudio.google.com/))
### Installation
```bash
# Clone the repository
git clone https://github.com/Manavraj-0/gemini_rag_api.git
cd gemini_rag_api
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
echo 'GEMINI_API_KEY="your-api-key-here"' > .env
# Create the knowledge base
python ingest.py
# Run the API
uvicorn main:app --reload
```
### Using Docker
```bash
docker build -t gemini-rag-api .
docker run -p 8000:8000 gemini-rag-api
```
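Note that the container needs the API key at runtime: if your image does not copy `.env`, pass the key in with Docker's `--env-file` flag, e.g. `docker run -p 8000:8000 --env-file .env gemini-rag-api`.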
---
## πŸ“– API Usage
### Interactive Documentation
Once running, visit: **http://localhost:8000/docs**
### Example Request
**Endpoint**: `POST /ask`
```bash
curl -X POST "http://localhost:8000/ask" \
-H "Content-Type: application/json" \
-d '{
"question": "What is this document about?"
}'
```
**Response**:
```json
{
"question": "What is this document about?",
"answer": "This document discusses...",
"source_documents": [
"Original text chunk 1...",
"Original text chunk 2..."
]
}
```
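The same request from Python, using `requests` (a quick check against a locally running server):

```python
import requests

resp = requests.post(
    "http://localhost:8000/ask",
    json={"question": "What is this document about?"},
    timeout=30,
)
resp.raise_for_status()

data = resp.json()
print(data["answer"])
for chunk in data["source_documents"]:
    print("-", chunk[:80])  # preview each retrieved chunk
```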
### Available Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/` | Welcome message |
| POST | `/ask` | Submit a question and get an answer |
| GET | `/docs` | Interactive API documentation |
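
For reference, here is a minimal sketch of how an endpoint with this contract can be declared in FastAPI. The model names (`AskRequest`, `AskResponse`) and the stub body are illustrative, not copied from `main.py`:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Gemini RAG Q&A API")

class AskRequest(BaseModel):
    question: str

class AskResponse(BaseModel):
    question: str
    answer: str
    source_documents: list[str]

@app.get("/")
def root():
    return {"message": "Welcome to the RAG Q&A API"}

@app.post("/ask", response_model=AskResponse)
def ask(req: AskRequest) -> AskResponse:
    # The real app invokes the RAG chain here; this stub only shows the contract.
    return AskResponse(
        question=req.question,
        answer="(answer from the RAG chain)",
        source_documents=["(retrieved chunk 1)", "(retrieved chunk 2)"],
    )
```

FastAPI generates the interactive `/docs` page from these Pydantic models automatically.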
---
## πŸ“ Project Structure
```
rag_project/
β”œβ”€β”€ main.py # FastAPI application & RAG chain
β”œβ”€β”€ ingest.py # Document processing & indexing
β”œβ”€β”€ data.txt # Your knowledge base document (swap in your own content)
β”œβ”€β”€ requirements.txt # Python dependencies
β”œβ”€β”€ Dockerfile # Container configuration
β”œβ”€β”€ .env # API keys (not committed)
└── faiss_index/ # Vector database (generated)
```
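For orientation, `ingest.py` implements the usual load, split, embed, and index pipeline. A minimal sketch under the stack listed above (the real script may differ in details):

```python
import os

from dotenv import load_dotenv
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

load_dotenv()  # reads GEMINI_API_KEY from .env

# Load the knowledge base and split it into overlapping chunks
docs = TextLoader("data.txt", encoding="utf-8").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# Embed each chunk and persist the FAISS index to disk
embeddings = GoogleGenerativeAIEmbeddings(
    model="models/gemini-embedding-001",
    google_api_key=os.environ["GEMINI_API_KEY"],
)
db = FAISS.from_documents(chunks, embeddings)
db.save_local("faiss_index")
```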
---
## πŸ”§ Configuration
### Customize Retrieval
In `main.py`, adjust the retriever:
```python
retriever = db.as_retriever(search_kwargs={"k": 3}) # Return top 3 results
```
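If the top-k chunks come back as near-duplicates, LangChain retrievers also support maximal marginal relevance (MMR) to diversify results; a variant worth trying with the same `db`:

```python
# Diversify retrieved chunks with maximal marginal relevance (MMR)
retriever = db.as_retriever(search_type="mmr", search_kwargs={"k": 3})
```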
### Adjust Model Temperature
```python
llm = ChatGoogleGenerativeAI(
model="gemini-2.5-flash",
temperature=0.1, # Lower = more focused, Higher = more creative
)
```
### Change Chunk Size
In `ingest.py`:
```python
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=1000, # Characters per chunk
chunk_overlap=100 # Overlap between chunks
)
```
---
## πŸ“Š Performance
- **Average Response Time**: <4 seconds
- **Embeddings**: 768-dimensional vectors
- **Vector Search**: FAISS L2 similarity
- **Chunk Strategy**: 1000 chars with 100 char overlap
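
For intuition about what the L2 search does beneath the LangChain wrapper, here is a self-contained toy example with random vectors (not the project's real index):

```python
import faiss
import numpy as np

d = 768                                           # vector dimensionality
rng = np.random.default_rng(0)
chunks = rng.random((1000, d), dtype="float32")   # stand-ins for chunk embeddings
query = rng.random((1, d), dtype="float32")       # stand-in for a query embedding

index = faiss.IndexFlatL2(d)                      # exact L2 (Euclidean) search
index.add(chunks)
distances, ids = index.search(query, 3)           # 3 nearest chunks
print(ids[0], distances[0])
```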
---
## 🀝 Skills Demonstrated
This project showcases:
- βœ… **Generative AI**: LLM integration and prompt engineering
- βœ… **Vector Databases**: Semantic search with FAISS
- βœ… **API Development**: RESTful design with FastAPI
- βœ… **ML Engineering**: Data preprocessing and pipeline optimization
- βœ… **DevOps**: Containerization and cloud deployment
- βœ… **Best Practices**: Code structure, documentation, version control
---
## πŸ› Troubleshooting
**Issue**: `API key not found`

- **Solution**: Ensure `.env` file exists with `GEMINI_API_KEY="your-key"`

**Issue**: `faiss_index not found`

- **Solution**: Run `python ingest.py` first to create the index

**Issue**: `Module not found`

- **Solution**: Install all dependencies: `pip install -r requirements.txt`
---
## πŸ‘€ Contact
- GitHub: [@Manavraj-0](https://github.com/Manavraj-0)
- LinkedIn: [Manav Rajvansh](https://linkedin.com/in/meet-manav-rajvansh)