Manavraj committed on
Commit a699672 · verified · 1 Parent(s): c46e2fe

Update README.md

Files changed (1)
  1. README.md +203 -7
README.md CHANGED
@@ -1,11 +1,207 @@
  ---
- title: Gemini Rag Api
- emoji: 👀
- colorFrom: yellow
- colorTo: gray
  sdk: docker
- pinned: false
- short_description: An Retrieval Augmented Generation API that uses KB
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
  ---
+ title: RAG Q&A API
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: green
  sdk: docker
+ app_port: 8000
  ---

+ # 🤖 RAG Q&A API - Intelligent Document Query System
+
+ > A production-ready Retrieval-Augmented Generation (RAG) API that answers questions using custom knowledge bases. Built to demonstrate enterprise-grade AI/ML development skills.
+
+ [![Live Demo](https://img.shields.io/badge/Demo-Live-success)](https://huggingface.co/spaces/Manavraj/gemini_rag_api)
+ [![Python 3.10+](https://img.shields.io/badge/Python-3.10+-blue)](https://www.python.org/)
+
+ ---
+
+ ## 🎯 Overview
+
+ This project implements a RAG system that answers natural-language questions about custom documents. It retrieves relevant context from your documents before generating an answer, which helps keep responses grounded in your data.
+
+ **Built for the WebMob Technologies AI/ML Developer Trainee position**
+
+ ### What is RAG?
+
+ RAG (Retrieval-Augmented Generation) combines:
+ 1. **Retrieval**: Finding relevant document chunks using semantic search
+ 2. **Augmentation**: Adding the retrieved context to the query
+ 3. **Generation**: Generating an answer grounded in the retrieved context
+
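The three steps above can be sketched in a few lines of plain Python. This is a toy illustration, not the project's code: the function names (`tokenize`, `retrieve`, `augment`, `generate`) are hypothetical, word overlap stands in for embedding-based semantic search, and the generation step is a placeholder rather than a real Gemini call.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Retrieval: rank chunks by word overlap with the question.

    Word overlap is only a stand-in for vector similarity search.
    """
    q = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)
    return ranked[:k]


def augment(question: str, context: list[str]) -> str:
    """Augmentation: combine the retrieved chunks and the question in one prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


def generate(prompt: str) -> str:
    """Generation: placeholder for the LLM call; it only reports the prompt size."""
    return f"[LLM would answer here, given a {len(prompt)}-character prompt]"


chunks = [
    "FAISS stores embedding vectors for similarity search.",
    "FastAPI serves the /ask endpoint.",
    "Docker packages the application for deployment.",
]
question = "Which endpoint does FastAPI serve?"
context = retrieve(question, chunks, k=1)
prompt = augment(question, context)
print(generate(prompt))
```

In the real pipeline the retriever and the model replace `retrieve` and `generate`, but the data flow (question, then context, then prompt, then answer) has the same shape.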
+ ---
+
+ ## ✨ Key Features
+
+ - 🧠 **Semantic Search**: FAISS vector database for intelligent context retrieval
+ - ⚡ **Fast Responses**: Optimized pipeline with <4s average response time
+ - 🌐 **FastAPI**: Clean API with automatic interactive documentation
+ - 🐳 **Docker Ready**: One-command deployment
+
+ ---
+
+ ## 🛠️ Technology Stack
+
+ - **LLM**: Google Gemini 2.5 Flash
+ - **Embeddings**: Google `gemini-embedding-001`
+ - **Vector DB**: FAISS (CPU)
+ - **Framework**: LangChain (LCEL)
+ - **API**: FastAPI + Uvicorn
+ - **Deployment**: Docker + Hugging Face Spaces
+
+ ---
+
+ ## 🚀 Quick Start
+
+ ### Prerequisites
+ - Python 3.10+
+ - Google API Key ([Get one here - Google AI Studio](https://aistudio.google.com/))
+
+ ### Installation
+
+ ```bash
+ # Clone the repository
+ git clone https://github.com/Manavraj-0/gemini_rag_api.git
+ cd gemini_rag_api
+
+ # Install dependencies
+ pip install -r requirements.txt
+
+ # Set up environment variables
+ echo 'GEMINI_API_KEY="your-api-key-here"' > .env
+
+ # Create the knowledge base
+ python ingest.py
+
+ # Run the API
+ uvicorn main:app --reload
+ ```
+
+ ### Using Docker
+
+ ```bash
+ docker build -t gemini-rag-api .
+ docker run -p 8000:8000 gemini-rag-api
+ ```
+
+ ---
+
+ ## 📖 API Usage
+
+ ### Interactive Documentation
+ Once running, visit: **http://localhost:8000/docs**
+
+ ### Example Request
+
+ **Endpoint**: `POST /ask`
+
+ ```bash
+ curl -X POST "http://localhost:8000/ask" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "question": "What is this document about?"
+   }'
+ ```
+
+ **Response**:
+ ```json
+ {
+   "question": "What is this document about?",
+   "answer": "This document discusses...",
+   "source_documents": [
+     "Original text chunk 1...",
+     "Original text chunk 2..."
+   ]
+ }
+ ```
+
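The same request can be made from Python using only the standard library. This is a minimal sketch, assuming the API runs at `http://localhost:8000` and returns the JSON shape shown above; the helper names `build_ask_request` and `ask` are hypothetical and not part of the project.

```python
import json
import urllib.request


def build_ask_request(
    question: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build (but do not send) a POST /ask request with a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(
    question: str, base_url: str = "http://localhost:8000", timeout: float = 30.0
) -> dict:
    """Send the request and decode the JSON response; needs the API to be running."""
    request = build_ask_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network, so this line runs offline.
request = build_ask_request("What is this document about?")
print(request.get_method(), request.full_url)
```

With the server running, `ask("What is this document about?")["answer"]` would return the answer text, assuming the response fields match the example.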
+ ### Available Endpoints
+
+ | Method | Endpoint | Description |
+ |--------|----------|-------------|
+ | GET | `/` | Welcome message |
+ | POST | `/ask` | Submit a question and get an answer |
+ | GET | `/docs` | Interactive API documentation |
+
+ ---
+
+ ## 📁 Project Structure
+
+ ```
+ rag_project/
+ ├── main.py              # FastAPI application & RAG chain
+ ├── ingest.py            # Document processing & indexing
+ ├── data.txt             # Your knowledge base document (change content to explore)
+ ├── requirements.txt     # Python dependencies
+ ├── Dockerfile           # Container configuration
+ ├── .env                 # API keys (not committed)
+ └── faiss_index/         # Vector database (generated)
+ ```
+
+ ---
+
+ ## 🔧 Configuration
+
+ ### Customize Retrieval
+ In `main.py`, adjust the retriever:
+ ```python
+ retriever = db.as_retriever(search_kwargs={"k": 3})  # Return top 3 results
+ ```
+
+ ### Adjust Model Temperature
+ ```python
+ llm = ChatGoogleGenerativeAI(
+     model="gemini-2.5-flash",
+     temperature=0.1,  # Lower = more focused, Higher = more creative
+ )
+ ```
+
+ ### Change Chunk Size
+ In `ingest.py`:
+ ```python
+ text_splitter = RecursiveCharacterTextSplitter(
+     chunk_size=1000,   # Characters per chunk
+     chunk_overlap=100  # Overlap between chunks
+ )
+ ```
+
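To see what `chunk_size` and `chunk_overlap` control, here is a simplified sliding-window splitter in plain Python. It is only an illustration of the two parameters: `RecursiveCharacterTextSplitter` prefers to split on separators such as paragraph and line breaks, so its chunk boundaries will generally differ. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# A 2500-character sample made of repeating digits.
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

Larger overlaps reduce the chance that a sentence relevant to a question is cut in half at a boundary, at the cost of storing and embedding more duplicated text.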
+ ---
+
+ ## 📊 Performance
+
+ - **Average Response Time**: <4 seconds
+ - **Embedding Model**: 768-dimensional vectors (requires setting the output dimensionality; `gemini-embedding-001` defaults to 3072)
+ - **Vector Search**: FAISS L2 (Euclidean) distance
+ - **Chunk Strategy**: 1000 chars with 100 char overlap
+
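As a rough picture of what an L2 vector search computes, the sketch below ranks stored vectors by Euclidean distance to a query using brute force. It assumes an exact (non-approximate) search and uses tiny 2-dimensional vectors for readability; FAISS does the same kind of comparison in optimized native code, and its flat L2 index reports squared distances, which yields the same ranking. The names `l2_distance` and `top_k` are hypothetical.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(
    query: list[float], vectors: list[list[float]], k: int = 3
) -> list[tuple[int, float]]:
    """Return (index, distance) pairs for the k vectors closest to the query."""
    scored = [(i, l2_distance(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])  # smaller distance means a closer match
    return scored[:k]


vectors = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0], [0.0, 2.0]]
results = top_k([0.0, 0.0], vectors, k=3)
print(results)
```

The `k` here plays the same role as `search_kwargs={"k": 3}` in the retriever configuration: it caps how many chunks are passed on as context.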
+ ---
+
+ ## 🤝 Skills Demonstrated
+
+ This project showcases:
+ - ✅ **Generative AI**: LLM integration and prompt engineering
+ - ✅ **Vector Databases**: Semantic search with FAISS
+ - ✅ **API Development**: RESTful design with FastAPI
+ - ✅ **ML Engineering**: Data preprocessing and pipeline optimization
+ - ✅ **DevOps**: Containerization and cloud deployment
+ - ✅ **Best Practices**: Code structure, documentation, version control
+
+ ---
+
+ ## 🐛 Troubleshooting
+
+ **Issue**: `API key not found`
+ - **Solution**: Ensure a `.env` file exists with `GEMINI_API_KEY="your-key"`
+
+ **Issue**: `faiss_index not found`
+ - **Solution**: Run `python ingest.py` first to create the index
+
+ **Issue**: `Module not found`
+ - **Solution**: Install all dependencies: `pip install -r requirements.txt`
+
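The first two issues can be caught before the server starts with a small preflight check. The sketch below is a hypothetical helper, not part of the project: it assumes the key has already been loaded into the environment (for example by whatever `.env` loader `main.py` uses) and that the index lives in a `faiss_index/` directory as shown in the project structure.

```python
import os
from pathlib import Path


def preflight(env: dict[str, str], index_dir: str = "faiss_index") -> list[str]:
    """Return a list of human-readable problems; an empty list means ready to start."""
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set; add it to .env or the environment")
    if not Path(index_dir).is_dir():
        problems.append(f"{index_dir}/ not found; run `python ingest.py` first")
    return problems


# Check the current process environment and working directory.
for problem in preflight(dict(os.environ)):
    print("Problem:", problem)
```

Passing the environment in as a plain dictionary keeps the check easy to test without modifying `os.environ`.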
+ ---
+
+ ## 👤 Contact
+
+ - GitHub: [@Manavraj-0](https://github.com/Manavraj-0)
+ - LinkedIn: [Manav Rajvansh](https://linkedin.com/in/meet-manav-rajvansh)