---
title: Gemini RAG Q&A API
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: docker
app_port: 8000
---

# 🤖 RAG Q&A API - Intelligent Document Query System

> A production-ready Retrieval-Augmented Generation (RAG) API that answers questions using custom knowledge bases. Built to demonstrate enterprise-grade AI/ML development skills.

<div style="display: flex; gap: 8px;">
  <a href="https://manavraj-gemini-rag-api.hf.space/docs" target="_blank">
    <img src="https://img.shields.io/badge/API-Try%20it%20Live-green?style=for-the-badge&logo=fastapi" alt="Try the Live API">
  </a>
  <a href="https://github.com/Manavraj-0/gemini_rag_api" target="_blank">
    <img src="https://img.shields.io/badge/Code-View%20on%20GitHub-blue?style=for-the-badge&logo=github" alt="View on GitHub">
  </a>
</div>

---

## 🎯 Overview

This project implements a RAG system that answers natural-language questions about custom documents. It retrieves relevant context from your documents before generating an answer, which keeps responses grounded in your data.

### What is RAG?

RAG (Retrieval-Augmented Generation) combines:
1. **Retrieval**: Finding relevant document chunks using semantic search
2. **Augmentation**: Adding retrieved context to the query
3. **Generation**: Creating accurate, source-backed answers
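
The three steps above can be sketched in plain Python. This is a toy illustration only: it ranks chunks by bag-of-words cosine similarity rather than the FAISS embeddings this project uses, and `generate` is a hypothetical stand-in for the Gemini call. All function names are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Step 1 (Retrieval): return the k chunks most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]


def augment(question: str, context: list[str]) -> str:
    """Step 2 (Augmentation): place the retrieved chunks in the prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


def generate(prompt: str) -> str:
    """Step 3 (Generation): hypothetical stand-in for the LLM call."""
    return f"[an LLM would answer here, given a {len(prompt)}-character prompt]"


chunks = [
    "FAISS stores embedding vectors for similarity search.",
    "Uvicorn is an ASGI server.",
    "Docker packages the API into a container.",
]
question = "Which library stores the vectors?"
prompt = augment(question, retrieve(question, chunks, k=1))
print(generate(prompt))
```

In the real pipeline, embeddings capture meaning rather than exact word overlap, so a question can match a chunk that shares no words with it.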

---

## ✨ Key Features

- 🧠 **Semantic Search**: FAISS vector database for intelligent context retrieval
- ⚑ **Fast Responses**: Optimized pipeline with <4s average response time
- 🌐 **FastAPI**: Clean API with automatic interactive documentation
- 🐳 **Docker Ready**: One-command deployment

---

## 🛠️ Technology Stack

- **LLM**: Google Gemini 2.5 Flash
- **Embeddings**: Google `gemini-embedding-001`
- **Vector DB**: FAISS (CPU)
- **Framework**: LangChain (LCEL)
- **API**: FastAPI + Uvicorn
- **Deployment**: Docker + Hugging Face Spaces

---

## 🚀 Quick Start

### Prerequisites
- Python 3.10+
- Google API Key ([Get one here - Google AI Studio](https://aistudio.google.com/))

### Installation

```bash
# Clone the repository
git clone https://github.com/Manavraj-0/gemini_rag_api.git
cd gemini_rag_api

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
echo 'GEMINI_API_KEY="your-api-key-here"' > .env

# Create the knowledge base
python ingest.py

# Run the API
uvicorn main:app --reload
```

### Using Docker

```bash
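docker build -t gemini-rag-api .
# Pass the API key at runtime if the image does not already contain a .env file
docker run -p 8000:8000 -e GEMINI_API_KEY="your-api-key-here" gemini-rag-api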
```

---

## 📖 API Usage

### Interactive Documentation
Once the server is running, visit: **http://localhost:8000/docs**

### Example Request

**Endpoint**: `POST /ask`

```bash
curl -X POST "http://localhost:8000/ask" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is this document about?"
  }'
```

**Response**:
```json
{
  "question": "What is this document about?",
  "answer": "This document discusses...",
  "source_documents": [
    "Original text chunk 1...",
    "Original text chunk 2..."
  ]
}
```
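
The same request can be sent from Python using only the standard library. This is a minimal sketch that assumes the server is listening on `http://localhost:8000`; the helper names `build_ask_request` and `ask` are made up for this example and are not part of the project.

```python
import json
import urllib.request


def build_ask_request(
    question: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build the POST /ask request without sending it."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(
    question: str, base_url: str = "http://localhost:8000", timeout: float = 30.0
) -> dict:
    """Send the question and return the decoded JSON response."""
    request = build_ask_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running, `ask("What is this document about?")["answer"]` should return the generated answer, and the `"source_documents"` key should hold the retrieved chunks.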

### Available Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/` | Welcome message |
| POST | `/ask` | Submit a question and get an answer |
| GET | `/docs` | Interactive API documentation |

---

## 📁 Project Structure

```
rag_project/
├── main.py              # FastAPI application & RAG chain
├── ingest.py            # Document processing & indexing
├── data.txt             # Your knowledge base document (change content to explore)
├── requirements.txt     # Python dependencies
├── Dockerfile           # Container configuration
├── .env                 # API keys (not committed)
└── faiss_index/         # Vector database (generated)
```

---

## 🔧 Configuration

### Customize Retrieval
In `main.py`, adjust the retriever:
```python
retriever = db.as_retriever(search_kwargs={"k": 3})  # Return top 3 results
```
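
To see what `k` controls, here is a brute-force sketch of top-k nearest-neighbor search by L2 distance. It is an illustration only: FAISS performs this kind of lookup with optimized index structures, and the function names and toy 2-dimensional vectors below are assumptions made for the example.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query: list[float], vectors: list[list[float]], k: int = 3) -> list[int]:
    """Return the indices of the k stored vectors closest to the query."""
    order = sorted(range(len(vectors)), key=lambda i: l2_distance(query, vectors[i]))
    return order[:k]


# Toy "embeddings"; real ones have hundreds of dimensions.
vectors = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.1], [5.0, 5.0]]
print(top_k([0.0, 0.1], vectors, k=3))  # prints [0, 2, 1]
```

A larger `k` passes more context to the model at the cost of a longer prompt; a smaller `k` keeps the prompt short but risks missing the relevant chunk.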

### Adjust Model Temperature
```python
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0.1,  # Lower = more focused, Higher = more creative
)
```
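
As a general illustration of why lower temperatures give more focused output, the sketch below applies temperature scaling to a softmax over made-up token scores. It shows the common sampling idea only; how Gemini applies `temperature` internally is not described here, and the function name and scores are assumptions.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0.1, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all probability lands on the highest-scoring token, which suits question answering over documents; higher values spread probability across more candidates.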

### Change Chunk Size
In `ingest.py`:
```python
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # Characters per chunk
    chunk_overlap=100   # Overlap between chunks
)
```
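
The effect of `chunk_size` and `chunk_overlap` can be sketched with a simplified fixed-window splitter. Note the assumption: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, which this toy `chunk_text` function ignores.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reaches the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# prints ['abcd', 'defg', 'ghij']
```

The overlap repeats the tail of each chunk at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.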

---

## 📊 Performance

- **Average Response Time**: <4 seconds
- **Embedding Model**: 768-dimensional vectors
- **Vector Search**: FAISS L2 (Euclidean) distance
- **Chunk Strategy**: 1000 chars with 100 char overlap

---

## 🀝 Skills Demonstrated

This project showcases:
- ✅ **Generative AI**: LLM integration and prompt engineering
- ✅ **Vector Databases**: Semantic search with FAISS
- ✅ **API Development**: RESTful design with FastAPI
- ✅ **ML Engineering**: Data preprocessing and pipeline optimization
- ✅ **DevOps**: Containerization and cloud deployment
- ✅ **Best Practices**: Code structure, documentation, version control

---

## 🐛 Troubleshooting

**Issue**: `API key not found`
- **Solution**: Ensure `.env` file exists with `GEMINI_API_KEY="your-key"`

**Issue**: `faiss_index not found`
- **Solution**: Run `python ingest.py` first to create the index

**Issue**: `Module not found`
- **Solution**: Install all dependencies: `pip install -r requirements.txt`
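
The first two checks above can be automated with a small preflight script. This is a hypothetical helper, not part of the repository: it assumes the key is exposed as the `GEMINI_API_KEY` environment variable and that the index lives in `faiss_index/` under the project directory. It inspects only the process environment, so a key that exists solely in `.env` is reported as missing unless something has loaded that file first.

```python
import os
from pathlib import Path


def preflight(env: dict[str, str], project_dir: Path) -> list[str]:
    """Return a list of setup problems; an empty list means the checks passed."""
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set; add it to .env or export it")
    if not (project_dir / "faiss_index").is_dir():
        problems.append("faiss_index/ is missing; run `python ingest.py` first")
    return problems


if __name__ == "__main__":
    for problem in preflight(dict(os.environ), Path.cwd()):
        print("WARNING:", problem)
```

Run it from the project root before starting Uvicorn; no output means both checks passed.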

---

## 👤 Contact

- GitHub: [@Manavraj-0](https://github.com/Manavraj-0)
- LinkedIn: [Manav Rajvansh](https://linkedin.com/in/meet-manav-rajvansh)