---
title: Gemini RAG Q&A API
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: docker
app_port: 8000
---
# 🤖 RAG Q&A API - Intelligent Document Query System
> A production-ready Retrieval-Augmented Generation (RAG) API that answers questions using custom knowledge bases. Built to demonstrate enterprise-grade AI/ML development skills.
<div style="display: flex; gap: 8px;">
<a href="https://manavraj-gemini-rag-api.hf.space/docs" target="_blank">
<img src="https://img.shields.io/badge/API-Try%20it%20Live-green?style=for-the-badge&logo=fastapi" alt="Try the Live API">
</a>
<a href="https://github.com/Manavraj-0/gemini_rag_api" target="_blank">
<img src="https://img.shields.io/badge/Code-View%20on%20GitHub-blue?style=for-the-badge&logo=github" alt="View on GitHub">
</a>
</div>
---
## 🎯 Overview
This project implements a RAG system that answers questions about custom documents using natural language. It retrieves relevant context from your documents before generating answers, ensuring responses are accurate and grounded in your data.
### What is RAG?
RAG (Retrieval-Augmented Generation) combines:
1. **Retrieval**: Finding relevant document chunks using semantic search
2. **Augmentation**: Adding retrieved context to the query
3. **Generation**: Creating accurate, source-backed answers
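As an illustration, the three steps can be sketched in plain Python. Everything below is a stand-in: the keyword-overlap `retrieve` replaces FAISS similarity search, and `generate` replaces the actual Gemini call.

```python
# Toy end-to-end sketch of the three RAG steps.

DOCS = [
    "FAISS is a library for efficient similarity search over dense vectors.",
    "FastAPI is a modern Python web framework for building APIs.",
    "RAG grounds LLM answers in retrieved document chunks.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """1. Retrieval: rank chunks by word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(DOCS, key=lambda d: -len(q_words & set(d.lower().split())))
    return ranked[:k]

def augment(question: str, chunks: list[str]) -> str:
    """2. Augmentation: prepend the retrieved context to the question."""
    context = "\n".join(chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def generate(prompt: str) -> str:
    """3. Generation: a real system would send the prompt to the LLM here."""
    first_chunk = prompt.splitlines()[1]  # peek at the injected context
    return f"(stub answer grounded in: {first_chunk})"

q = "What is FAISS used for?"
print(generate(augment(q, retrieve(q))))
```

The real pipeline wires these same three stages together with LangChain's LCEL, with FAISS and Gemini doing the heavy lifting.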
---
## ✨ Key Features
- 🧠 **Semantic Search**: FAISS vector database for intelligent context retrieval
- ⚡ **Fast Responses**: Optimized pipeline with <4s average response time
- 🚀 **FastAPI**: Clean API with automatic interactive documentation
- 🐳 **Docker Ready**: One-command deployment
---
## 🛠️ Technology Stack
- **LLM**: Google Gemini 2.5 Flash
- **Embeddings**: Google `gemini-embedding-001`
- **Vector DB**: FAISS (CPU)
- **Framework**: LangChain (LCEL)
- **API**: FastAPI + Uvicorn
- **Deployment**: Docker + Hugging Face Spaces
---
## 🚀 Quick Start
### Prerequisites
- Python 3.10+
- Google API Key ([Get one here - Google AI Studio](https://aistudio.google.com/))
### Installation
```bash
# Clone the repository
git clone https://github.com/Manavraj-0/gemini_rag_api.git
cd gemini_rag_api
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
echo 'GEMINI_API_KEY="your-api-key-here"' > .env
# Create the knowledge base
python ingest.py
# Run the API
uvicorn main:app --reload
```
### Using Docker
```bash
docker build -t gemini-rag-api .
docker run -p 8000:8000 -e GEMINI_API_KEY="your-api-key-here" gemini-rag-api
```
---
## 📖 API Usage
### Interactive Documentation
Once running, visit: **http://localhost:8000/docs**
### Example Request
**Endpoint**: `POST /ask`
```bash
curl -X POST "http://localhost:8000/ask" \
-H "Content-Type: application/json" \
-d '{
"question": "What is this document about?"
}'
```
**Response**:
```json
{
"question": "What is this document about?",
"answer": "This document discusses...",
"source_documents": [
"Original text chunk 1...",
"Original text chunk 2..."
]
}
```
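The same request can be made from Python using only the standard library (adjust the base URL if the API is deployed elsewhere):

```python
import json
from urllib.request import Request, urlopen

def ask(question: str, base_url: str = "http://localhost:8000") -> dict:
    """POST a question to the /ask endpoint and return the parsed JSON."""
    body = json.dumps({"question": question}).encode("utf-8")
    req = Request(f"{base_url}/ask", data=body,
                  headers={"Content-Type": "application/json"})
    with urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Requires the server to be running:
# print(ask("What is this document about?")["answer"])
```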
### Available Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/` | Welcome message |
| POST | `/ask` | Submit a question and get an answer |
| GET | `/docs` | Interactive API documentation |
---
## 📁 Project Structure
```
rag_project/
├── main.py              # FastAPI application & RAG chain
├── ingest.py            # Document processing & indexing
├── data.txt             # Your knowledge base document (swap in your own content)
├── requirements.txt     # Python dependencies
├── Dockerfile           # Container configuration
├── .env                 # API keys (not committed)
└── faiss_index/         # Vector database (generated)
```
---
## 🔧 Configuration
### Customize Retrieval
In `main.py`, adjust the retriever:
```python
retriever = db.as_retriever(search_kwargs={"k": 3}) # Return top 3 results
```
### Adjust Model Temperature
```python
llm = ChatGoogleGenerativeAI(
model="gemini-2.5-flash",
temperature=0.1, # Lower = more focused, Higher = more creative
)
```
### Change Chunk Size
In `ingest.py`:
```python
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=1000, # Characters per chunk
chunk_overlap=100 # Overlap between chunks
)
```
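For intuition, here is a simplified, character-only version of this chunking strategy. The `chunk` helper below is hypothetical; the real `RecursiveCharacterTextSplitter` also tries to break on paragraph and sentence boundaries before falling back to raw characters.

```python
def chunk(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    """Naive fixed-size splitter: each chunk repeats the last `overlap`
    characters of the previous one so context isn't cut mid-thought."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

print(chunk("abcdefghij", size=4, overlap=2))  # → ['abcd', 'cdef', 'efgh', 'ghij']
```

Note how each chunk shares its first two characters with the end of the previous chunk; that shared margin is what `chunk_overlap` controls.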
---
## 📊 Performance
- **Average Response Time**: <4 seconds
- **Embedding Model**: 768-dimensional vectors
- **Vector Search**: FAISS L2 similarity
- **Chunk Strategy**: 1000 chars with 100 char overlap
---
## 🤓 Skills Demonstrated
This project showcases:
- ✅ **Generative AI**: LLM integration and prompt engineering
- ✅ **Vector Databases**: Semantic search with FAISS
- ✅ **API Development**: RESTful design with FastAPI
- ✅ **ML Engineering**: Data preprocessing and pipeline optimization
- ✅ **DevOps**: Containerization and cloud deployment
- ✅ **Best Practices**: Code structure, documentation, version control
---
## 🐛 Troubleshooting
**Issue**: `API key not found`
- **Solution**: Ensure `.env` file exists with `GEMINI_API_KEY="your-key"`
**Issue**: `faiss_index not found`
- **Solution**: Run `python ingest.py` first to create the index
**Issue**: `Module not found`
- **Solution**: Install all dependencies: `pip install -r requirements.txt`
---
## 🤝 Contact
- GitHub: [@Manavraj-0](https://github.com/Manavraj-0)
- LinkedIn: [Manav Rajvansh](https://linkedin.com/in/meet-manav-rajvansh)