---
title: QUESTRAG Backend
emoji: 🏦
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
---

# 🏦 QUESTRAG - Banking QUEries and Support system via Trained Reinforced RAG

[![Python 3.12](https://img.shields.io/badge/python-3.12-blue.svg)](https://www.python.org/downloads/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.104.1-green.svg)](https://fastapi.tiangolo.com/)
[![React](https://img.shields.io/badge/React-18.3.1-blue.svg)](https://reactjs.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Deployed on HuggingFace](https://img.shields.io/badge/🤗-HuggingFace%20Spaces-yellow)](https://huggingface.co/spaces/eeshanyaj/questrag-backend)

> An intelligent banking chatbot powered by **Retrieval-Augmented Generation (RAG)** and **Reinforcement Learning (RL)** that provides accurate, context-aware responses to Indian banking queries while optimizing token costs.

---

## 📋 Table of Contents

- [Overview](#overview)
- [Key Features](#key-features)
- [System Architecture](#system-architecture)
- [Technology Stack](#technology-stack)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [Datasets](#datasets)
- [Performance Metrics](#performance-metrics)
- [API Documentation](#api-documentation)
- [Deployment](#deployment)
- [Contributing](#contributing)
- [License](#license)
- [Acknowledgments](#acknowledgments)
- [Contact](#contact)
- [Status](#status)
- [Links](#links)

---

## 🎯 Overview

QUESTRAG is an **advanced banking chatbot** designed to revolutionize customer support in the Indian banking sector. By combining **Retrieval-Augmented Generation (RAG)** with **Reinforcement Learning (RL)**, the system intelligently decides when to fetch external context from a knowledge base and when to respond directly, **reducing token costs by up to 31%** while maintaining high accuracy.
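To make that trade-off concrete, here is a toy sketch of the reward signal such a fetch/no-fetch policy can be trained against, under one plausible reading of the scheme this README lists for the policy network (+2.0 accurate, +0.5 needed fetch, -0.5 incorrect). The function and its argument names are illustrative, not the project's actual training code.

```python
# Toy reward for a FETCH/NO_FETCH policy decision, using the reward values
# stated in this README. Branching is one plausible interpretation of the
# scheme, not the project's actual training logic.

def decision_reward(answer_correct: bool, fetch_was_needed: bool) -> float:
    """Score one policy decision after the generated answer is evaluated."""
    if not answer_correct:
        return -0.5          # incorrect answer: penalize the decision
    if fetch_was_needed:
        return 0.5           # correct, but only because context was fetched
    return 2.0               # accurate without paying for a fetch
```

Maximizing this kind of reward over logged queries is what pushes the policy to skip retrieval whenever it is not needed, which is where the token savings come from.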
### Problem Statement

Existing banking chatbots suffer from:

- ❌ Limited response flexibility (rigid, rule-based systems)
- ❌ Poor handling of informal/real-world queries
- ❌ Lack of contextual understanding
- ❌ High operational costs due to inefficient token usage
- ❌ Low user satisfaction and trust

### Solution

QUESTRAG addresses these challenges through:

- ✅ **Domain-specific RAG** trained on 19,000+ banking queries and support data
- ✅ **RL-optimized policy network** (BERT-based) for smart context-fetching decisions
- ✅ **Fine-tuned retriever model** (E5-base-v2) using InfoNCE + Triplet Loss
- ✅ **Groq LLM with HuggingFace fallback** for reliable, fast responses
- ✅ **Full-stack web application** with modern UI/UX and JWT authentication

---

## 🌟 Key Features

### 🤖 Intelligent RAG Pipeline

- **FAISS-powered retrieval** for fast similarity search across 19,352 documents
- **Fine-tuned embedding model** (`e5-base-v2`) trained on English + Hinglish paraphrases
- **Context-aware response generation** using Llama 3 models (8B & 70B) via Groq

### 🧠 Reinforcement Learning System

- **BERT-based policy network** (`bert-base-uncased`) for FETCH/NO_FETCH decisions
- **Reward-driven optimization** (+2.0 accurate, +0.5 needed fetch, -0.5 incorrect)
- **31% token cost reduction** via optimized retrieval

### 🎨 Modern Web Interface

- **React 18 + Vite** with Tailwind CSS
- **Real-time chat**, conversation history, JWT authentication
- **Responsive design** for desktop and mobile

### 🔐 Enterprise-Ready Backend

- **FastAPI + MongoDB Atlas** for scalable async operations
- **JWT authentication** with secure password hashing (bcrypt)
- **Multi-provider LLM** (Groq → HuggingFace automatic fallback)
- **Deployed on HuggingFace Spaces** with Docker containerization

---

## 🏗️ System Architecture

System Architecture Diagram

### 🔄 Workflow

1. **User Query** → FastAPI receives the query via REST API
2. **Policy Decision** → BERT-based RL model decides FETCH or NO_FETCH
3. **Conditional Retrieval** → If FETCH, retrieve the top-5 documents from FAISS using E5-base-v2
4. **Response Generation** → Llama 3 (via Groq) generates the final answer
5. **Evaluation & Logging** → Interaction logged in MongoDB + reward-based model update

---

## 🔄 Sequence Diagram

Sequence Diagram

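The five-step workflow can be sketched in miniature. Every name below is an illustrative stand-in for the real BERT policy, FAISS retriever, and Groq-hosted LLM; only `TOP_K` and `CONFIDENCE_THRESHOLD` mirror configuration defaults listed later in this README.

```python
from dataclasses import dataclass

# Values mirroring this README's configuration defaults.
TOP_K = 5
CONFIDENCE_THRESHOLD = 0.7

@dataclass
class PolicyDecision:
    action: str        # "FETCH" or "NO_FETCH"
    confidence: float

def decide(query: str) -> PolicyDecision:
    """Stand-in for the BERT policy network: fetch for knowledge-style queries."""
    needs_context = any(w in query.lower() for w in ("rate", "loan", "fee", "limit"))
    return PolicyDecision("FETCH" if needs_context else "NO_FETCH", 0.9)

def retrieve(query: str, k: int = TOP_K) -> list[str]:
    """Stand-in for FAISS + E5-base-v2 retrieval over the knowledge base."""
    return [f"doc-{i} relevant to {query!r}" for i in range(k)]

def answer(query: str) -> dict:
    """Steps 1-4 of the workflow: receive, decide, conditionally retrieve, generate."""
    decision = decide(query)
    context: list[str] = []
    if decision.action == "FETCH" and decision.confidence >= CONFIDENCE_THRESHOLD:
        context = retrieve(query)
    # A real deployment would now prompt Llama 3 via Groq with `context`
    # and log the interaction to MongoDB (step 5).
    response = f"answer to {query!r} using {len(context)} documents"
    return {"response": response,
            "policy_action": decision.action,
            "documents_retrieved": len(context)}
```

The point of the sketch is the conditional in `answer`: retrieval (and its token cost) only happens when the policy is confident that context is needed.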
---

## 🛠️ Technology Stack

### **Frontend**

- ⚛️ React 18.3.1 + Vite 5.4.2
- 🎨 Tailwind CSS 3.4.1
- 🔄 React Context API + Axios + React Router DOM

### **Backend**

- 🚀 FastAPI 0.104.1
- 🗄️ MongoDB Atlas + Motor (async driver)
- 🔑 JWT Auth + Passlib (bcrypt)
- 🤖 PyTorch 2.9.1, Transformers 4.57, FAISS 1.13.0
- 💬 Groq (Llama 3.1 8B Instant / Llama 3.3 70B Versatile)
- 🎯 Sentence Transformers 5.1.2

### **Machine Learning**

- 🧠 **Policy Network**: BERT-base-uncased (trained with RL)
- 🔍 **Retriever**: E5-base-v2 (fine-tuned with InfoNCE + Triplet Loss)
- 📊 **Vector Store**: FAISS (19,352 documents)

### **Deployment**

- 🐳 Docker (HuggingFace Spaces)
- 🤗 HuggingFace Hub (model storage)
- ☁️ MongoDB Atlas (cloud database)
- 🌐 Python 3.12 + uvicorn

---

## ⚙️ Installation

### 🧩 Prerequisites

- Python 3.12+
- Node.js 18+
- MongoDB Atlas account (or local MongoDB 6.0+)
- Groq API key (or HuggingFace token)

### 🔧 Backend Setup (Local Development)

```bash
# Navigate to backend
cd backend

# Create virtual environment
python -m venv venv

# Activate it
source venv/bin/activate   # Linux/Mac
venv\Scripts\activate      # Windows

# Install dependencies
pip install -r requirements.txt

# Create environment file
cp .env.example .env
# Edit .env with your credentials (see Configuration section)

# Build FAISS index (one-time setup)
python build_faiss_index.py

# Start backend server
uvicorn app.main:app --reload --port 8000
```

### 💻 Frontend Setup

```bash
# Navigate to frontend
cd frontend

# Install dependencies
npm install

# Create environment file
cp .env.example .env
# Update VITE_API_URL to point to your backend

# Start dev server
npm run dev
```

---

## ⚙️ Configuration

### 🔑 Backend `.env` (Key Parameters)

| **Category**   | **Key**                       | **Example / Description**                        |
|----------------|-------------------------------|--------------------------------------------------|
| Environment    | `ENVIRONMENT`                 | `development` or `production`                    |
| MongoDB        | `MONGODB_URI`                 | `mongodb+srv://user:pass@cluster.mongodb.net/`   |
| Authentication | `SECRET_KEY`                  | Generate with `python -c "import secrets; print(secrets.token_urlsafe(32))"` |
|                | `ALGORITHM`                   | `HS256`                                          |
|                | `ACCESS_TOKEN_EXPIRE_MINUTES` | `1440` (24 hours)                                |
| Groq API       | `GROQ_API_KEY_1`              | Your primary Groq API key                        |
|                | `GROQ_API_KEY_2`              | Secondary key (optional)                         |
|                | `GROQ_API_KEY_3`              | Tertiary key (optional)                          |
|                | `GROQ_CHAT_MODEL`             | `llama-3.1-8b-instant`                           |
|                | `GROQ_EVAL_MODEL`             | `llama-3.3-70b-versatile`                        |
| HuggingFace    | `HF_TOKEN_1`                  | HuggingFace token (fallback LLM)                 |
|                | `HF_MODEL_REPO`               | `eeshanyaj/questrag_models` (for model download) |
| Model Paths    | `POLICY_MODEL_PATH`           | `app/models/best_policy_model.pth`               |
|                | `RETRIEVER_MODEL_PATH`        | `app/models/best_retriever_model.pth`            |
|                | `FAISS_INDEX_PATH`            | `app/models/faiss_index.pkl`                     |
|                | `KB_PATH`                     | `app/data/final_knowledge_base.jsonl`            |
| Device         | `DEVICE`                      | `cpu` or `cuda`                                  |
| RAG Params     | `TOP_K`                       | `5` (number of documents to retrieve)            |
|                | `SIMILARITY_THRESHOLD`        | `0.5` (minimum similarity score)                 |
| Policy Network | `CONFIDENCE_THRESHOLD`        | `0.7` (policy decision confidence)               |
| CORS           | `ALLOWED_ORIGINS`             | `http://localhost:5173` or `*`                   |

### 🌐 Frontend `.env`

```bash
# Local development
VITE_API_URL=http://localhost:8000

# Production (HuggingFace Spaces)
VITE_API_URL=https://eeshanyaj-questrag-backend.hf.space
```

---

## 🚀 Usage

### 🖥️ Local Development

#### Start Backend Server

```bash
cd backend
source venv/bin/activate   # or venv\Scripts\activate
uvicorn app.main:app --reload --port 8000
```

- **Backend**: http://localhost:8000
- **API Docs**: http://localhost:8000/docs
- **Health Check**: http://localhost:8000/health

#### Start Frontend Dev Server

```bash
cd frontend
npm run dev
```

- **Frontend**: http://localhost:5173

### 🌐 Production (HuggingFace Spaces)

**Backend API**:

- **Base URL**: https://eeshanyaj-questrag-backend.hf.space
- **API Docs**: https://eeshanyaj-questrag-backend.hf.space/docs
- **Health Check**: https://eeshanyaj-questrag-backend.hf.space/health
**Frontend** (Coming Soon):

- Will be deployed on Vercel/Netlify

---

## 📁 Project Structure

```
questrag/
│
├── backend/
│   ├── app/
│   │   ├── api/v1/
│   │   │   ├── auth.py                      # Auth endpoints (register, login)
│   │   │   └── chat.py                      # Chat endpoints
│   │   ├── core/
│   │   │   ├── llm_manager.py               # Groq + HF LLM orchestration
│   │   │   └── security.py                  # JWT & password hashing
│   │   ├── ml/
│   │   │   ├── policy_network.py            # RL policy model (BERT)
│   │   │   └── retriever.py                 # E5-base-v2 retriever
│   │   ├── db/
│   │   │   ├── mongodb.py                   # MongoDB connection
│   │   │   └── repositories/                # User & conversation repos
│   │   ├── services/
│   │   │   └── chat_service.py              # Orchestration logic
│   │   ├── models/
│   │   │   ├── best_policy_model.pth        # Trained policy network
│   │   │   ├── best_retriever_model.pth     # Fine-tuned retriever
│   │   │   └── faiss_index.pkl              # FAISS vector store
│   │   ├── data/
│   │   │   └── final_knowledge_base.jsonl   # 19,352 Q&A pairs
│   │   ├── config.py                        # Settings & env vars
│   │   └── main.py                          # FastAPI app entry point
│   ├── Dockerfile                           # Docker config for HF Spaces
│   ├── requirements.txt
│   └── .env.example
│
└── frontend/
    ├── src/
    │   ├── components/                      # UI components
    │   ├── context/                         # Auth context
    │   ├── pages/                           # Login, Register, Chat
    │   ├── services/api.js                  # Axios client
    │   ├── App.jsx
    │   └── main.jsx
    ├── package.json
    └── .env
```

---

## 📊 Datasets

### 1. Final Knowledge Base

- **Size**: 19,352 question-answer pairs
- **Categories**: 15 banking categories
- **Intents**: 22 unique intents (ATM, CARD, LOAN, ACCOUNT, etc.)
- **Source**: Combination of:
  - Bitext Retail Banking Dataset (Hugging Face)
  - RetailBanking-Conversations Dataset
  - Manually curated FAQs from SBI, ICICI, HDFC, Yes Bank, Axis Bank

### 2. Retriever Training Dataset

- **Size**: 11,655 paraphrases
- **Source**: 1,665 unique FAQs from the knowledge base
- **Paraphrases per FAQ**:
  - 4 English paraphrases
  - 2 Hinglish paraphrases
  - Original FAQ
- **Training**: InfoNCE Loss + Triplet Loss with E5-base-v2

### 3. Policy Network Training Dataset

- **Size**: 182 queries from 6 chat sessions
- **Format**: (state, action, reward) tuples
- **Actions**: FETCH (1) or NO_FETCH (0)
- **Rewards**: +2.0 (correct), +0.5 (needed fetch), -0.5 (incorrect)

---

## 📈 Performance Metrics

*Coming soon: detailed performance metrics including accuracy, response time, token cost reduction, and user satisfaction scores.*

---

## 📚 API Documentation

### Authentication

#### Register

```http
POST /api/v1/auth/register
Content-Type: application/json

{
  "username": "john_doe",
  "email": "john@example.com",
  "password": "securepassword123"
}
```

**Response:**

```json
{
  "message": "User registered successfully",
  "user_id": "507f1f77bcf86cd799439011"
}
```

#### Login

```http
POST /api/v1/auth/login
Content-Type: application/json

{
  "username": "john_doe",
  "password": "securepassword123"
}
```

**Response:**

```json
{
  "access_token": "eyJhbGciOiJIUzI1NiIs...",
  "token_type": "bearer"
}
```

---

### Chat

#### Send Message

```http
POST /api/v1/chat/
Authorization: Bearer <token>
Content-Type: application/json

{
  "query": "What are the interest rates for home loans?",
  "conversation_id": "optional-session-id"
}
```

**Response:**

```json
{
  "response": "Current home loan interest rates range from 8.5% to 9.5% per annum...",
  "conversation_id": "abc123",
  "metadata": {
    "policy_action": "FETCH",
    "retrieval_score": 0.89,
    "documents_retrieved": 5,
    "llm_provider": "groq"
  }
}
```

#### Get Conversation History

```http
GET /api/v1/chat/conversations/{conversation_id}
Authorization: Bearer <token>
```

**Response:**

```json
{
  "conversation_id": "abc123",
  "messages": [
    {
      "role": "user",
      "content": "What are the interest rates?",
      "timestamp": "2025-11-28T10:30:00Z"
    },
    {
      "role": "assistant",
      "content": "Current rates are...",
      "timestamp": "2025-11-28T10:30:05Z",
      "metadata": { "policy_action": "FETCH" }
    }
  ]
}
```

#### List All Conversations

```http
GET /api/v1/chat/conversations
Authorization: Bearer <token>
```

#### Delete Conversation

```http
DELETE /api/v1/chat/conversation/{conversation_id}
Authorization: Bearer <token>
```

---

## 🚀 Deployment

### HuggingFace Spaces (Backend)

The backend is deployed on HuggingFace Spaces using Docker:

1. **Models are stored** on HuggingFace Hub: `eeshanyaj/questrag_models`
2. **On first startup**, models are automatically downloaded from HF Hub
3. **Docker container** runs FastAPI with uvicorn on port 7860
4. **Environment secrets** are securely managed in HF Space settings

**Deployment Steps:**

```bash
# 1. Upload models to HuggingFace Hub
huggingface-cli upload eeshanyaj/questrag_models \
  app/models/best_policy_model.pth \
  models/best_policy_model.pth

# 2. Push backend code to HF Space
git remote add space https://huggingface.co/spaces/eeshanyaj/questrag-backend
git push space main

# 3. Add environment secrets in HF Space Settings
#    (MongoDB URI, Groq keys, JWT secret, etc.)
```

### Frontend Deployment (Vercel/Netlify)

```bash
# Build for production
npm run build

# Deploy to Vercel
vercel --prod

# Update .env.production with the backend URL
VITE_API_URL=https://eeshanyaj-questrag-backend.hf.space
```

---

## 🤝 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
### Development Guidelines

- Follow PEP 8 for Python code
- Use ESLint + Prettier for JavaScript/React
- Write comprehensive docstrings and comments
- Add unit tests for new features
- Update documentation accordingly

---

## 📄 License

MIT License — see [LICENSE](LICENSE)

---

## 🙏 Acknowledgments

### Research Inspiration

- **Main Paper**: "Optimizing Retrieval Augmented Generation for Domain-Specific Chatbots with Reinforcement Learning" (AAAI 2024)
- **Additional References**:
  - "Evaluating BERT-based Rewards for Question Generation with RL"
  - "Self-Reasoning for Retrieval-Augmented Language Models"

### Open Source Resources

- [RL-Self-Improving-RAG](https://github.com/subrata-samanta/RL-Self-Improving-RAG)
- [ARENA](https://github.com/ren258/ARENA)
- [RAGTechniques](https://github.com/NirDiamant/RAGTechniques)
- [Financial-RAG-From-Scratch](https://github.com/cse-amarjeet/Financial-RAG-From-Scratch)

### Datasets

- [Bitext Retail Banking Dataset](https://huggingface.co/datasets/bitext/Bitext-retail-banking-llm-chatbot-training-dataset)
- [RetailBanking-Conversations](https://huggingface.co/datasets/oopere/RetailBanking-Conversations)

### Technologies

- [FastAPI](https://fastapi.tiangolo.com/)
- [React](https://reactjs.org/)
- [HuggingFace](https://huggingface.co/)
- [Groq](https://groq.com/)
- [MongoDB Atlas](https://www.mongodb.com/cloud/atlas)

---

## 📞 Contact

**Eeshanya Amit Joshi**
📧 [Email](mailto:eeshanyajoshi@gmail.com)
💼 [LinkedIn](https://www.linkedin.com/in/eeshanyajoshi/)

---

## 📈 Status

### ✅ **Backend Deployed & Live!**

- 🚀 Backend API running on [HuggingFace Spaces](https://eeshanyaj-questrag-backend.hf.space)
- 📚 API documentation available at [/docs](https://eeshanyaj-questrag-backend.hf.space/docs)
- 💚 Health status: [Check here](https://eeshanyaj-questrag-backend.hf.space/health)

### 🚧 **Frontend Deployment - Coming Soon!**

- Will be deployed on Vercel/Netlify
- Stay tuned for the full application link! ❤️

---

## 🔗 Links

- **Live Backend API:** https://eeshanyaj-questrag-backend.hf.space
- **API Documentation:** https://eeshanyaj-questrag-backend.hf.space/docs
- **Health Check:** https://eeshanyaj-questrag-backend.hf.space/health
- **HuggingFace Space:** https://huggingface.co/spaces/eeshanyaj/questrag-backend
- **Model Repository:** https://huggingface.co/eeshanyaj/questrag_models
- **Research Paper:** [AAAI 2024 Workshop](https://arxiv.org/abs/2401.06800)

---

✨ Made with ❤️ for the Banking Industry ✨

Powered by HuggingFace 🤗 | Groq ⚡ | MongoDB 🍃 | Docker 🐳