title: Gemini RAG Q&A API
colorFrom: blue
colorTo: green
sdk: docker
app_port: 8000
RAG Q&A API - Intelligent Document Query System
A production-ready Retrieval-Augmented Generation (RAG) API that answers questions using custom knowledge bases. Built to demonstrate enterprise-grade AI/ML development skills.
Overview
This project implements a RAG system that answers questions about custom documents using natural language. It retrieves relevant context from your documents before generating answers, ensuring responses are accurate and grounded in your data.
What is RAG?
RAG (Retrieval-Augmented Generation) combines:
- Retrieval: Finding relevant document chunks using semantic search
- Augmentation: Adding retrieved context to the query
- Generation: Creating accurate, source-backed answers
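The retrieval and augmentation stages above can be sketched in plain Python. This is a toy illustration, not the project's code: it swaps the Gemini embeddings and FAISS index for a bag-of-words cosine similarity, and the names `embed`, `retrieve`, `augment`, and the sample chunks are hypothetical. The generation stage is left out because it would need a call to the LLM.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Retrieval: return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def augment(question: str, context: list[str]) -> str:
    """Augmentation: place the retrieved chunks in the prompt ahead of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Hypothetical knowledge-base chunks
chunks = [
    "FAISS stores document embeddings for similarity search.",
    "Uvicorn serves the FastAPI application.",
    "Docker packages the service into a container.",
]
question = "What does FAISS store?"
prompt = augment(question, retrieve(question, chunks, k=1))
print(prompt)
```

In the real pipeline, the resulting prompt would be passed to the LLM, which produces the final answer from the supplied context.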
Key Features
- Semantic Search: FAISS vector database for intelligent context retrieval
- Fast Responses: Optimized pipeline with <4s average response time
- FastAPI: Clean API with automatic interactive documentation
- Docker Ready: One-command deployment
Technology Stack
- LLM: Google Gemini 2.5 Flash
- Embeddings: Google gemini-embedding-001
- Vector DB: FAISS (CPU)
- Framework: LangChain (LCEL)
- API: FastAPI + Uvicorn
- Deployment: Docker + Hugging Face Spaces
Quick Start
Prerequisites
- Python 3.10+
- Google API key (available from Google AI Studio)
Installation
# Clone the repository
git clone https://github.com/Manavraj-0/gemini_rag_api.git
cd gemini_rag_api
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
echo 'GEMINI_API_KEY="your-api-key-here"' > .env
# Create the knowledge base
python ingest.py
# Run the API
uvicorn main:app --reload
Using Docker
docker build -t gemini-rag-api .
docker run -p 8000:8000 gemini-rag-api
API Usage
Interactive Documentation
Once running, visit: http://localhost:8000/docs
Example Request
Endpoint: POST /ask
curl -X POST "http://localhost:8000/ask" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is this document about?"
  }'
Response:
{
  "question": "What is this document about?",
  "answer": "This document discusses...",
  "source_documents": [
    "Original text chunk 1...",
    "Original text chunk 2..."
  ]
}
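The same request can be sent from Python. The sketch below uses only the standard library and assumes the API is reachable at `http://localhost:8000`; the helper names `build_ask_request` and `ask` are hypothetical and not part of the project.

```python
import json
import urllib.request


def build_ask_request(question: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST /ask request carrying the question as a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = "http://localhost:8000", timeout: float = 30.0) -> dict:
    """Send the question and return the decoded JSON response."""
    request = build_ask_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (requires the API to be running locally):
# result = ask("What is this document about?")
# print(result["answer"])
# for chunk in result["source_documents"]:
#     print("source:", chunk)
```

Building the request separately from sending it makes it possible to inspect the URL, method, and body without a running server.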
Available Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Welcome message |
| POST | /ask | Submit a question and get an answer |
| GET | /docs | Interactive API documentation |
Project Structure
rag_project/
├── main.py            # FastAPI application & RAG chain
├── ingest.py          # Document processing & indexing
├── data.txt           # Your knowledge base document (edit its content to experiment)
├── requirements.txt   # Python dependencies
├── Dockerfile         # Container configuration
├── .env               # API keys (not committed)
└── faiss_index/       # Vector database (generated)
Configuration
Customize Retrieval
In main.py, adjust the retriever:
retriever = db.as_retriever(search_kwargs={"k": 3}) # Return top 3 results
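To make the effect of `k` concrete, here is a minimal pure-Python sketch of top-k nearest-neighbour search by L2 distance. It is only a conceptual stand-in for what the vector index does, assuming the index ranks by L2 distance as the Performance section states. The function name `top_k_l2` and the 2-D vectors are made up for illustration.

```python
import math


def top_k_l2(query: list[float], vectors: list[list[float]], k: int = 3) -> list[tuple[int, float]]:
    """Return (index, distance) for the k vectors closest to the query by L2 distance."""
    scored = [(i, math.dist(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scored[:k]


# Hypothetical 2-D "embeddings"; real ones have hundreds of dimensions.
vectors = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.1], [5.0, 5.0]]
hits = top_k_l2([1.0, 1.0], vectors, k=2)
print(hits)
```

A larger `k` passes more chunks to the LLM, which can improve recall at the cost of a longer prompt.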
Adjust Model Temperature
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0.1,  # Lower = more focused, Higher = more creative
)
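A common way to picture why a lower temperature gives more focused output is temperature-scaled softmax: scores are divided by the temperature before being turned into probabilities. The sketch below is conceptual, uses made-up scores, and does not call Gemini or describe its internals.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
focused = softmax_with_temperature(logits, 0.1)
creative = softmax_with_temperature(logits, 1.5)
print([round(p, 3) for p in focused])
print([round(p, 3) for p in creative])
```

At a low temperature nearly all probability mass lands on the top-scoring candidate; at a higher temperature it is spread more evenly, so sampling becomes more varied.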
Change Chunk Size
In ingest.py:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,   # Characters per chunk
    chunk_overlap=100  # Overlap between chunks
)
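The interaction between chunk size and overlap can be illustrated with a simplified fixed-window splitter. This is a sketch only: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and spaces, so its real chunks vary in length. The function `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "x" * 2500  # stand-in for the contents of data.txt
pieces = chunk_text(sample)
print([len(p) for p in pieces])
```

Each window starts `chunk_size - chunk_overlap` characters after the previous one, so the last 100 characters of one chunk reappear at the start of the next. After changing these values, the index would need to be rebuilt with `python ingest.py`.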
Performance
- Average Response Time: <4 seconds
- Embedding Model: 768-dimensional vectors
- Vector Search: FAISS with L2 (Euclidean) distance
- Chunk Strategy: 1000 chars with 100 char overlap
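A response-time figure like the one above can be checked with a small timing harness. The sketch below is an assumption about how such a measurement might be taken, not the method used to obtain the number reported here. `average_latency` is a hypothetical helper and the workload is a placeholder.

```python
import time
from statistics import mean
from typing import Callable


def average_latency(call: Callable[[], object], runs: int = 5) -> float:
    """Return the mean wall-clock seconds taken by `call` over several runs."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        durations.append(time.perf_counter() - start)
    return mean(durations)


# Placeholder workload; replace with a function that sends a request to /ask.
print(f"{average_latency(lambda: time.sleep(0.01)):.3f}s")
```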
Skills Demonstrated
This project showcases:
- Generative AI: LLM integration and prompt engineering
- Vector Databases: Semantic search with FAISS
- API Development: RESTful design with FastAPI
- ML Engineering: Data preprocessing and pipeline optimization
- DevOps: Containerization and cloud deployment
- Best Practices: Code structure, documentation, version control
Troubleshooting
Issue: API key not found
- Solution: Ensure a .env file exists with GEMINI_API_KEY="your-key"
Issue: faiss_index not found
- Solution: Run python ingest.py first to create the index
Issue: Module not found
- Solution: Install all dependencies: pip install -r requirements.txt
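The first two checks can be automated before starting the server. The following is a minimal sketch using only the standard library. `preflight` is a hypothetical helper, and it assumes the `.env` file and `faiss_index/` directory sit in the project root as shown in the project structure.

```python
from pathlib import Path


def preflight(project_dir: str = ".") -> list[str]:
    """Return a list of setup problems; an empty list means the basics are in place."""
    root = Path(project_dir)
    problems = []

    env_file = root / ".env"
    if not env_file.is_file():
        problems.append(".env file is missing")
    else:
        lines = env_file.read_text(encoding="utf-8").splitlines()
        # The key counts as set only if something non-empty follows the "=".
        has_key = any(
            line.strip().startswith("GEMINI_API_KEY=") and line.split("=", 1)[1].strip(" \"'")
            for line in lines
        )
        if not has_key:
            problems.append("GEMINI_API_KEY is not set in .env")

    if not (root / "faiss_index").is_dir():
        problems.append("faiss_index/ is missing; run python ingest.py")

    return problems


for problem in preflight():
    print("problem:", problem)
```

Run from the project root, the script prints nothing when both the key and the index are present.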
Contact
- GitHub: @Manavraj-0
- LinkedIn: Manav Rajvansh