---
title: Gemini RAG Q&A API
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: docker
app_port: 8000
---

# 🤖 RAG Q&A API - Intelligent Document Query System

A production-ready Retrieval-Augmented Generation (RAG) API that answers questions using custom knowledge bases. Built to demonstrate enterprise-grade AI/ML development skills.


## 🎯 Overview

This project implements a RAG system that answers questions about custom documents using natural language. It retrieves relevant context from your documents before generating answers, ensuring responses are accurate and grounded in your data.

### What is RAG?

RAG (Retrieval-Augmented Generation) combines:

1. **Retrieval**: Finding relevant document chunks using semantic search
2. **Augmentation**: Adding retrieved context to the query
3. **Generation**: Creating accurate, source-backed answers
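To make the three stages concrete, here is a self-contained toy sketch. A bag-of-words vectorizer stands in for gemini-embedding-001 and a linear scan stands in for FAISS, so only the control flow matches the real pipeline; all names (`DOCS`, `embed`, `retrieve`, `augment`) are illustrative, not from this repo.

```python
# Toy illustration of the three RAG stages with hand-rolled "embeddings".
import math
from collections import Counter

DOCS = [
    "FAISS is a library for efficient similarity search over dense vectors.",
    "FastAPI is a modern Python web framework for building APIs.",
    "RAG grounds LLM answers in retrieved document chunks.",
]

def embed(text: str) -> Counter:
    # Stand-in "embedding": lowercase bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, k: int = 1) -> list[str]:
    # 1. Retrieval: rank chunks by similarity to the question.
    q = embed(question)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def augment(question: str, context: list[str]) -> str:
    # 2. Augmentation: prepend the retrieved chunks to the prompt.
    return f"Answer using only this context:\n{' '.join(context)}\n\nQuestion: {question}"

# 3. Generation: the augmented prompt would now be sent to the LLM.
question = "What is FAISS used for?"
print(augment(question, retrieve(question)))
```

The real project swaps in dense embeddings and an approximate-nearest-neighbor index, but the retrieve/augment/generate shape is the same.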

## ✨ Key Features

- 🧠 **Semantic Search**: FAISS vector database for intelligent context retrieval
- ⚡ **Fast Responses**: Optimized pipeline with <4s average response time
- 🌐 **FastAPI**: Clean API with automatic interactive documentation
- 🐳 **Docker Ready**: One-command deployment

πŸ› οΈ Technology Stack

  • LLM: Google Gemini 2.5 Flash
  • Embeddings: Google gemini-embedding-001
  • Vector DB: FAISS (CPU)
  • Framework: LangChain (LCEL)
  • API: FastAPI + Uvicorn
  • Deployment: Docker + Hugging Face Spaces

## 🚀 Quick Start

### Prerequisites

- Python 3 with pip
- A Google Gemini API key (set as `GEMINI_API_KEY`)
- Docker (optional, for containerized deployment)
### Installation

```bash
# Clone the repository
git clone https://github.com/Manavraj-0/gemini_rag_api.git
cd gemini_rag_api

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
echo 'GEMINI_API_KEY="your-api-key-here"' > .env

# Create the knowledge base
python ingest.py

# Run the API
uvicorn main:app --reload
```

### Using Docker

```bash
docker build -t gemini-rag-api .
docker run -p 8000:8000 gemini-rag-api
```

## 📖 API Usage

### Interactive Documentation

Once running, visit: http://localhost:8000/docs

### Example Request

**Endpoint:** `POST /ask`

```bash
curl -X POST "http://localhost:8000/ask" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is this document about?"
  }'
```

**Response:**

```json
{
  "question": "What is this document about?",
  "answer": "This document discusses...",
  "source_documents": [
    "Original text chunk 1...",
    "Original text chunk 2..."
  ]
}
```
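A client consuming `/ask` can sanity-check this response shape before trusting it. The sketch below mirrors the fields shown above; the `AskResponse` dataclass and `parse_ask_response` helper are hypothetical, not part of this repo.

```python
# Client-side sketch: parse an /ask response and verify it is grounded.
import json
from dataclasses import dataclass

@dataclass
class AskResponse:
    question: str
    answer: str
    source_documents: list[str]

def parse_ask_response(raw: str) -> AskResponse:
    resp = AskResponse(**json.loads(raw))
    # A RAG answer without sources is not grounded - treat it as an error.
    if not resp.source_documents:
        raise ValueError("no source documents returned")
    return resp

raw = '''{
  "question": "What is this document about?",
  "answer": "This document discusses...",
  "source_documents": ["Original text chunk 1...", "Original text chunk 2..."]
}'''
print(parse_ask_response(raw).answer)
```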

### Available Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET    | `/`      | Welcome message |
| POST   | `/ask`   | Submit a question and get an answer |
| GET    | `/docs`  | Interactive API documentation |

πŸ“ Project Structure

rag_project/
β”œβ”€β”€ main.py              # FastAPI application & RAG chain
β”œβ”€β”€ ingest.py            # Document processing & indexing
β”œβ”€β”€ data.txt             # Your knowledge base document (change content to explore)
β”œβ”€β”€ requirements.txt     # Python dependencies
β”œβ”€β”€ Dockerfile           # Container configuration
β”œβ”€β”€ .env                 # API keys (not committed)
└── faiss_index/         # Vector database (generated)

## 🔧 Configuration

### Customize Retrieval

In `main.py`, adjust the retriever:

```python
retriever = db.as_retriever(search_kwargs={"k": 3})  # Return the top 3 chunks
```

### Adjust Model Temperature

```python
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0.1,  # Lower = more focused, higher = more creative
)
```

### Change Chunk Size

In `ingest.py`:

```python
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # Characters per chunk
    chunk_overlap=100,  # Overlap between chunks
)
```
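To see how these two knobs interact, here is a simplified sliding-window stand-in. LangChain's `RecursiveCharacterTextSplitter` additionally prefers breaking on separators such as newlines; this toy `chunk` function (not from this repo) only shows the size/overlap arithmetic.

```python
# Simplified chunker: fixed-size windows that overlap by `chunk_overlap`
# characters, so context at chunk boundaries is not lost.
def chunk(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list[str]:
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window slides each time
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

print(chunk("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

Larger overlap costs more storage and retrieval candidates but reduces the chance a relevant sentence is split across two chunks.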

## 📊 Performance

- **Average Response Time**: <4 seconds
- **Embedding Model**: 768-dimensional vectors
- **Vector Search**: FAISS L2 similarity
- **Chunk Strategy**: 1000 chars with 100-char overlap

## 🤝 Skills Demonstrated

This project showcases:

- ✅ **Generative AI**: LLM integration and prompt engineering
- ✅ **Vector Databases**: Semantic search with FAISS
- ✅ **API Development**: RESTful design with FastAPI
- ✅ **ML Engineering**: Data preprocessing and pipeline optimization
- ✅ **DevOps**: Containerization and cloud deployment
- ✅ **Best Practices**: Code structure, documentation, version control

πŸ› Troubleshooting

Issue: API key not found

  • Solution: Ensure .env file exists with GEMINI_API_KEY="your-key"

Issue: faiss_index not found

  • Solution: Run python ingest.py first to create the index
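The missing-index failure above can also be caught at startup instead of at request time. A hedged sketch for `main.py` (the `require_index` helper is illustrative, not code from this repo):

```python
# Fail fast with an actionable message if ingest.py has not run yet.
from pathlib import Path

def require_index(index_dir: str = "faiss_index") -> Path:
    path = Path(index_dir)
    if not path.is_dir():
        raise RuntimeError(
            f"'{index_dir}' not found - run `python ingest.py` to build it first"
        )
    return path
```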

**Issue:** Module not found

- **Solution:** Install all dependencies with `pip install -r requirements.txt`

## 👤 Contact