---
title: Gemini RAG Q&A API
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: docker
app_port: 8000
---
# 🤖 RAG Q&A API - Intelligent Document Query System

> A production-ready Retrieval-Augmented Generation (RAG) API that answers questions using custom knowledge bases. Built to demonstrate enterprise-grade AI/ML development skills.
<div style="display: flex; gap: 8px;">
  <a href="https://manavraj-gemini-rag-api.hf.space/docs" target="_blank">
    <img src="https://img.shields.io/badge/API-Try%20it%20Live-green?style=for-the-badge&logo=fastapi" alt="Try the Live API">
  </a>
  <a href="https://github.com/Manavraj-0/gemini_rag_api" target="_blank">
    <img src="https://img.shields.io/badge/Code-View%20on%20GitHub-blue?style=for-the-badge&logo=github" alt="View on GitHub">
  </a>
</div>
---

## 🎯 Overview

This project implements a RAG system that answers questions about custom documents using natural language. It retrieves relevant context from your documents before generating answers, ensuring responses are accurate and grounded in your data.

### What is RAG?

RAG (Retrieval-Augmented Generation) combines:
1. **Retrieval**: Finding relevant document chunks using semantic search
2. **Augmentation**: Adding retrieved context to the query
3. **Generation**: Creating accurate, source-backed answers
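The three steps above can be sketched in plain Python. A toy bag-of-words scorer stands in for the real `gemini-embedding-001` + FAISS stack, and assembling the prompt stands in for the Gemini call; the function names and sample chunks here are illustrative, not the project's actual code:

```python
import math

def embed(text: str) -> dict:
    # Toy bag-of-words "embedding" (term -> count). The real pipeline calls
    # gemini-embedding-001 and stores dense vectors in FAISS instead.
    counts: dict = {}
    for tok in text.lower().split():
        tok = tok.strip(".,?!")
        counts[tok] = counts.get(tok, 0) + 1
    return counts

def cosine(a: dict, b: dict) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(v * b.get(t, 0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list, k: int = 2) -> list:
    # 1. Retrieval: rank chunks by similarity to the question, keep the top k.
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def augment(question: str, context: list) -> str:
    # 2. Augmentation: splice the retrieved chunks into the prompt.
    return ("Answer using only this context:\n"
            + "\n".join(context)
            + f"\n\nQuestion: {question}")

# 3. Generation: the augmented prompt is what gets sent to the LLM.
chunks = [
    "FAISS stores dense vectors for fast similarity search.",
    "Gemini generates the final answer from retrieved context.",
    "Paris is the capital of France.",
]
prompt = augment("How is the final answer generated?",
                 retrieve("How is the final answer generated?", chunks))
```

Only the first two steps are deterministic code; generation is an LLM call, which is why grounding the prompt in retrieved context matters.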
---

## ✨ Key Features

- 🧠 **Semantic Search**: FAISS vector database for intelligent context retrieval
- ⚡ **Fast Responses**: Optimized pipeline with <4s average response time
- 📖 **FastAPI**: Clean API with automatic interactive documentation
- 🐳 **Docker Ready**: One-command deployment

---
## 🛠️ Technology Stack

- **LLM**: Google Gemini 2.5 Flash
- **Embeddings**: Google `gemini-embedding-001`
- **Vector DB**: FAISS (CPU)
- **Framework**: LangChain (LCEL)
- **API**: FastAPI + Uvicorn
- **Deployment**: Docker + Hugging Face Spaces

---
## 🚀 Quick Start

### Prerequisites

- Python 3.10+
- Google API Key ([Get one here - Google AI Studio](https://aistudio.google.com/))

### Installation

```bash
# Clone the repository
git clone https://github.com/Manavraj-0/gemini_rag_api.git
cd gemini_rag_api

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
echo 'GEMINI_API_KEY="your-api-key-here"' > .env

# Create the knowledge base
python ingest.py

# Run the API
uvicorn main:app --reload
```
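The `.env` file created above has to be loaded into the environment before `main.py` reads `GEMINI_API_KEY`; projects like this typically use `python-dotenv`'s `load_dotenv()` at startup. As a rough sketch of what that loading does, assuming `.env` holds simple `KEY="value"` lines:

```python
import os
from pathlib import Path

def load_env_file(path: str = ".env") -> bool:
    """Minimal .env loader: copy KEY=VALUE lines into os.environ.

    python-dotenv handles quoting, `export` prefixes, and multiline
    values far more robustly; this only illustrates the mechanism.
    """
    p = Path(path)
    if not p.exists():
        return False
    for line in p.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        # Don't clobber variables already set in the real environment.
        os.environ.setdefault(key.strip(), value.strip().strip('"'))
    return True
```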
### Using Docker

```bash
docker build -t gemini-rag-api .
docker run -p 8000:8000 --env-file .env gemini-rag-api
```

---
## 📖 API Usage

### Interactive Documentation

Once running, visit: **http://localhost:8000/docs**

### Example Request

**Endpoint**: `POST /ask`

```bash
curl -X POST "http://localhost:8000/ask" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is this document about?"
  }'
```

**Response**:

```json
{
  "question": "What is this document about?",
  "answer": "This document discusses...",
  "source_documents": [
    "Original text chunk 1...",
    "Original text chunk 2..."
  ]
}
```

### Available Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/` | Welcome message |
| POST | `/ask` | Submit a question and get an answer |
| GET | `/docs` | Interactive API documentation |
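To call `/ask` from Python rather than curl, a small stdlib-only client is enough; the `base_url` default assumes the local `uvicorn` run from the Quick Start:

```python
import json
import urllib.request

def build_request(question: str,
                  base_url: str = "http://localhost:8000") -> urllib.request.Request:
    # Build the same POST /ask request the curl example sends.
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(question: str, base_url: str = "http://localhost:8000") -> dict:
    # Send the request and decode the JSON body shown in the Response example.
    with urllib.request.urlopen(build_request(question, base_url)) as resp:
        return json.load(resp)
```

With the server running, `ask("What is this document about?")["answer"]` returns the generated answer string.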
---

## 📁 Project Structure

```
rag_project/
├── main.py            # FastAPI application & RAG chain
├── ingest.py          # Document processing & indexing
├── data.txt           # Your knowledge base document (change content to explore)
├── requirements.txt   # Python dependencies
├── Dockerfile         # Container configuration
├── .env               # API keys (not committed)
└── faiss_index/       # Vector database (generated)
```
---

## 🔧 Configuration

### Customize Retrieval

In `main.py`, adjust the retriever:

```python
retriever = db.as_retriever(search_kwargs={"k": 3})  # Return top 3 results
```

### Adjust Model Temperature

```python
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0.1,  # Lower = more focused, higher = more creative
)
```

### Change Chunk Size

In `ingest.py`:

```python
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # Characters per chunk
    chunk_overlap=100,  # Overlap between chunks
)
```
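To see what `chunk_size` and `chunk_overlap` actually control, here is a naive fixed-width chunker. `RecursiveCharacterTextSplitter` additionally prefers to break on paragraph and sentence boundaries, so its real chunks are rarely this uniform; this sketch only illustrates the sliding-window arithmetic:

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list:
    # Slide a chunk_size window forward by (chunk_size - chunk_overlap) each
    # step, so consecutive chunks share chunk_overlap characters of context.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]
```

Larger overlap reduces the chance that an answer straddles a chunk boundary, at the cost of indexing (and retrieving) some text twice.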
---

## 📊 Performance

- **Average Response Time**: <4 seconds
- **Embedding Model**: 768-dimensional vectors
- **Vector Search**: FAISS L2 similarity
- **Chunk Strategy**: 1000 chars with 100-char overlap

---
## 🤖 Skills Demonstrated

This project showcases:

- ✅ **Generative AI**: LLM integration and prompt engineering
- ✅ **Vector Databases**: Semantic search with FAISS
- ✅ **API Development**: RESTful design with FastAPI
- ✅ **ML Engineering**: Data preprocessing and pipeline optimization
- ✅ **DevOps**: Containerization and cloud deployment
- ✅ **Best Practices**: Code structure, documentation, version control

---
## 🐛 Troubleshooting

**Issue**: `API key not found`
- **Solution**: Ensure the `.env` file exists and contains `GEMINI_API_KEY="your-key"`

**Issue**: `faiss_index not found`
- **Solution**: Run `python ingest.py` first to create the index

**Issue**: `Module not found`
- **Solution**: Install all dependencies: `pip install -r requirements.txt`

---
## 🤝 Contact

- GitHub: [@Manavraj-0](https://github.com/Manavraj-0)
- LinkedIn: [Manav Rajvansh](https://linkedin.com/in/meet-manav-rajvansh)