---
title: QUESTRAG Backend
emoji: 🏦
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
---

🏦 QUESTRAG - Banking QUEries and Support system via Trained Reinforced RAG

Python 3.12 FastAPI React License: MIT Deployed on HuggingFace

An intelligent banking chatbot powered by Retrieval-Augmented Generation (RAG) and Reinforcement Learning (RL) to provide accurate, context-aware responses to Indian banking queries while optimizing token costs.




🎯 Overview

QUESTRAG is an advanced banking chatbot designed to revolutionize customer support in the Indian banking sector. By combining Retrieval-Augmented Generation (RAG) with Reinforcement Learning (RL), the system intelligently decides when to fetch external context from a knowledge base and when to respond directly, reducing token costs by up to 31% while maintaining high accuracy.

Problem Statement

Existing banking chatbots suffer from:

  • ❌ Limited response flexibility (rigid, rule-based systems)
  • ❌ Poor handling of informal/real-world queries
  • ❌ Lack of contextual understanding
  • ❌ High operational costs due to inefficient token usage
  • ❌ Low user satisfaction and trust

Solution

QUESTRAG addresses these challenges through:

  • ✅ Domain-specific RAG trained on 19,000+ banking query and support records
  • ✅ RL-optimized policy network (BERT-based) for smart context-fetching decisions
  • ✅ Fine-tuned retriever model (E5-base-v2) using InfoNCE + Triplet Loss
  • ✅ Groq LLM with HuggingFace fallback for reliable, fast responses
  • ✅ Full-stack web application with modern UI/UX and JWT authentication

🌟 Key Features

🤖 Intelligent RAG Pipeline

  • FAISS-powered retrieval for fast similarity search across 19,352 documents
  • Fine-tuned embedding model (e5-base-v2) trained on English + Hinglish paraphrases
  • Context-aware response generation using Llama 3 models (8B & 70B) via Groq
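For intuition, the inner-product top-k search that FAISS runs over the embedded documents can be sketched as a brute-force NumPy equivalent (the toy vectors and dimensionality below are illustrative only, not the real index):

```python
import numpy as np

def top_k_search(query_vec, doc_matrix, k=5):
    """Brute-force stand-in for the FAISS inner-product search:
    score every document embedding against the query and return
    the k best (index, score) pairs."""
    # E5 embeddings are L2-normalised, so inner product == cosine similarity
    scores = doc_matrix @ query_vec
    top = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in top]

# Toy corpus of four "document" embeddings (dimension 3 for illustration)
docs = np.array([[1, 0, 0], [0, 1, 0], [0.9, 0.1, 0], [0, 0, 1]], dtype=np.float32)
docs /= np.linalg.norm(docs, axis=1, keepdims=True)
query = np.array([1.0, 0.0, 0.0], dtype=np.float32)

hits = top_k_search(query, docs, k=2)   # best two documents with their scores
```

FAISS replaces the full matrix scan with an index structure, which is what makes the search fast at the scale of the full knowledge base.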

🧠 Reinforcement Learning System

  • BERT-based policy network (bert-base-uncased) for FETCH/NO_FETCH decisions
  • Reward-driven optimization (+2.0 accurate, +0.5 needed fetch, -0.5 incorrect)
  • 31% token cost reduction via optimized retrieval
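A sketch of that reward scheme as code; exactly how the three signals combine in the actual training loop is our assumption:

```python
FETCH, NO_FETCH = 1, 0

def reward(action, answer_correct, fetch_was_needed):
    """Reward shaping from the bullet above: +2.0 for an accurate
    answer, -0.5 for an incorrect one, plus +0.5 when the policy
    fetched context that was genuinely needed (the combination
    rule here is illustrative)."""
    r = 2.0 if answer_correct else -0.5
    if action == FETCH and fetch_was_needed:
        r += 0.5
    return r
```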

🎨 Modern Web Interface

  • React 18 + Vite with Tailwind CSS
  • Real-time chat, conversation history, JWT authentication
  • Responsive design for desktop and mobile

πŸ” Enterprise-Ready Backend

  • FastAPI + MongoDB Atlas for scalable async operations
  • JWT authentication with secure password hashing (bcrypt)
  • Multi-provider LLM (Groq → HuggingFace automatic fallback)
  • Deployed on HuggingFace Spaces with Docker containerization
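The Groq → HuggingFace fallback can be sketched as a provider loop; the function name and error handling details here are illustrative, not the actual llm_manager.py code:

```python
def generate_with_fallback(prompt, providers):
    """Try each (name, call) pair in order, Groq first, then the
    HuggingFace fallback, and return the first completion that
    succeeds along with the provider that produced it."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:           # rate limit, timeout, outage, ...
            errors.append((name, repr(exc)))
    raise RuntimeError(f"all LLM providers failed: {errors}")

def flaky_groq(prompt):
    raise TimeoutError("rate limited")     # simulate a Groq outage

provider_name, text = generate_with_fallback(
    "hello", [("groq", flaky_groq), ("huggingface", lambda p: f"echo: {p}")]
)
```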

πŸ—οΈ System Architecture

System Architecture Diagram

🔄 Workflow

  1. User Query → FastAPI receives the query via REST API
  2. Policy Decision → BERT-based RL model decides FETCH or NO_FETCH
  3. Conditional Retrieval → if FETCH, retrieve top-5 docs from FAISS using E5-base-v2
  4. Response Generation → Llama 3 (via Groq) generates the final answer
  5. Evaluation & Logging → logged in MongoDB + reward-based model update
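The steps above can be sketched end to end with the ML components stubbed out (CONFIDENCE_THRESHOLD and TOP_K come from the Configuration section; the stub implementations passed in are assumptions):

```python
CONFIDENCE_THRESHOLD = 0.7   # policy decision confidence (see Configuration)
TOP_K = 5                    # documents retrieved on a FETCH

def answer(query, policy, retriever, llm):
    """Steps 2-4 of the workflow. `policy`, `retriever` and `llm`
    are stand-ins for the BERT policy network, the E5/FAISS
    retriever and the Groq client."""
    p_fetch = policy(query)                      # step 2: FETCH probability
    context = retriever(query, TOP_K) if p_fetch >= CONFIDENCE_THRESHOLD else []
    prompt = "\n".join(context + [query])        # step 4: grounded prompt
    return llm(prompt), ("FETCH" if context else "NO_FETCH")

reply, action = answer(
    "What is the ATM withdrawal limit?",
    policy=lambda q: 0.9,                        # confident FETCH
    retriever=lambda q, k: ["Daily ATM limit is Rs. 25,000."],
    llm=lambda p: p.splitlines()[0],
)
```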

🔄 Sequence Diagram

Sequence Diagram


🛠️ Technology Stack

Frontend

  • ⚛️ React 18.3.1 + Vite 5.4.2
  • 🎨 Tailwind CSS 3.4.1
  • 🔄 React Context API + Axios + React Router DOM

Backend

  • 🚀 FastAPI 0.104.1
  • 🗄️ MongoDB Atlas + Motor (async driver)
  • 🔑 JWT Auth + Passlib (bcrypt)
  • 🤖 PyTorch 2.9.1, Transformers 4.57, FAISS 1.13.0
  • 💬 Groq (Llama 3.1 8B Instant / Llama 3.3 70B Versatile)
  • 🎯 Sentence Transformers 5.1.2

Machine Learning

  • 🧠 Policy Network: BERT-base-uncased (trained with RL)
  • 🔍 Retriever: E5-base-v2 (fine-tuned with InfoNCE + Triplet Loss)
  • 📊 Vector Store: FAISS (19,352 documents)

Deployment

  • 🐳 Docker (HuggingFace Spaces)
  • 🤗 HuggingFace Hub (model storage)
  • ☁️ MongoDB Atlas (cloud database)
  • 🌐 Python 3.12 + uvicorn

βš™οΈ Installation

🧩 Prerequisites

  • Python 3.12+
  • Node.js 18+
  • MongoDB Atlas account (or local MongoDB 6.0+)
  • Groq API key (or HuggingFace token)

🔧 Backend Setup (Local Development)

# Navigate to backend
cd backend

# Create virtual environment
python -m venv venv

# Activate it
source venv/bin/activate  # Linux/Mac
venv\Scripts\activate     # Windows

# Install dependencies
pip install -r requirements.txt

# Create environment file
cp .env.example .env
# Edit .env with your credentials (see Configuration section)

# Build FAISS index (one-time setup)
python build_faiss_index.py

# Start backend server
uvicorn app.main:app --reload --port 8000

💻 Frontend Setup

# Navigate to frontend
cd frontend

# Install dependencies
npm install

# Create environment file
cp .env.example .env
# Update VITE_API_URL to point to your backend

# Start dev server
npm run dev

βš™οΈ Configuration

πŸ”‘ Backend .env (Key Parameters)

| Category | Key | Example / Description |
|---|---|---|
| Environment | ENVIRONMENT | development or production |
| MongoDB | MONGODB_URI | mongodb+srv://user:pass@cluster.mongodb.net/ |
| Authentication | SECRET_KEY | Generate with python -c "import secrets; print(secrets.token_urlsafe(32))" |
| | ALGORITHM | HS256 |
| | ACCESS_TOKEN_EXPIRE_MINUTES | 1440 (24 hours) |
| Groq API | GROQ_API_KEY_1 | Your primary Groq API key |
| | GROQ_API_KEY_2 | Secondary key (optional) |
| | GROQ_API_KEY_3 | Tertiary key (optional) |
| | GROQ_CHAT_MODEL | llama-3.1-8b-instant |
| | GROQ_EVAL_MODEL | llama-3.3-70b-versatile |
| HuggingFace | HF_TOKEN_1 | HuggingFace token (fallback LLM) |
| | HF_MODEL_REPO | eeshanyaj/questrag_models (for model download) |
| Model Paths | POLICY_MODEL_PATH | app/models/best_policy_model.pth |
| | RETRIEVER_MODEL_PATH | app/models/best_retriever_model.pth |
| | FAISS_INDEX_PATH | app/models/faiss_index.pkl |
| | KB_PATH | app/data/final_knowledge_base.jsonl |
| Device | DEVICE | cpu or cuda |
| RAG Params | TOP_K | 5 (number of documents to retrieve) |
| | SIMILARITY_THRESHOLD | 0.5 (minimum similarity score) |
| Policy Network | CONFIDENCE_THRESHOLD | 0.7 (policy decision confidence) |
| CORS | ALLOWED_ORIGINS | http://localhost:5173 or * |
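The backend's app/config.py reads these values from the environment. A minimal stdlib sketch of that pattern, covering a subset of the keys above (the real app may well use pydantic settings instead; this is an assumption):

```python
import os
from dataclasses import dataclass, field

@dataclass
class Settings:
    """Subset of the .env keys above, with their documented defaults."""
    mongodb_uri: str = field(default_factory=lambda: os.getenv("MONGODB_URI", ""))
    device: str = field(default_factory=lambda: os.getenv("DEVICE", "cpu"))
    top_k: int = field(default_factory=lambda: int(os.getenv("TOP_K", "5")))
    similarity_threshold: float = field(
        default_factory=lambda: float(os.getenv("SIMILARITY_THRESHOLD", "0.5")))
    confidence_threshold: float = field(
        default_factory=lambda: float(os.getenv("CONFIDENCE_THRESHOLD", "0.7")))

settings = Settings()
```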

🌐 Frontend .env

# Local development
VITE_API_URL=http://localhost:8000

# Production (HuggingFace Spaces)
VITE_API_URL=https://eeshanyaj-questrag-backend.hf.space

🚀 Usage

🖥️ Local Development

Start Backend Server

cd backend
source venv/bin/activate  # or venv\Scripts\activate
uvicorn app.main:app --reload --port 8000

Start Frontend Dev Server

cd frontend
npm run dev

🌐 Production (HuggingFace Spaces)

Backend API: https://eeshanyaj-questrag-backend.hf.space

Frontend (Coming Soon):

  • Will be deployed on Vercel/Netlify

πŸ“ Project Structure

questrag/
│
├── backend/
│   ├── app/
│   │   ├── api/v1/
│   │   │   ├── auth.py              # Auth endpoints (register, login)
│   │   │   └── chat.py              # Chat endpoints
│   │   ├── core/
│   │   │   ├── llm_manager.py       # Groq + HF LLM orchestration
│   │   │   └── security.py          # JWT & password hashing
│   │   ├── ml/
│   │   │   ├── policy_network.py    # RL policy model (BERT)
│   │   │   └── retriever.py         # E5-base-v2 retriever
│   │   ├── db/
│   │   │   ├── mongodb.py           # MongoDB connection
│   │   │   └── repositories/        # User & conversation repos
│   │   ├── services/
│   │   │   └── chat_service.py      # Orchestration logic
│   │   ├── models/
│   │   │   ├── best_policy_model.pth      # Trained policy network
│   │   │   ├── best_retriever_model.pth   # Fine-tuned retriever
│   │   │   └── faiss_index.pkl            # FAISS vector store
│   │   ├── data/
│   │   │   └── final_knowledge_base.jsonl # 19,352 Q&A pairs
│   │   ├── config.py                # Settings & env vars
│   │   └── main.py                  # FastAPI app entry point
│   ├── Dockerfile                   # Docker config for HF Spaces
│   ├── requirements.txt
│   └── .env.example
│
└── frontend/
    ├── src/
    │   ├── components/              # UI components
    │   ├── context/                 # Auth context
    │   ├── pages/                   # Login, Register, Chat
    │   ├── services/api.js          # Axios client
    │   ├── App.jsx
    │   └── main.jsx
    ├── package.json
    └── .env

📊 Datasets

1. Final Knowledge Base

  • Size: 19,352 question-answer pairs
  • Categories: 15 banking categories
  • Intents: 22 unique intents (ATM, CARD, LOAN, ACCOUNT, etc.)
  • Source: Combination of:
    • Bitext Retail Banking Dataset (Hugging Face)
    • RetailBanking-Conversations Dataset
    • Manually curated FAQs from SBI, ICICI, HDFC, Yes Bank, Axis Bank

2. Retriever Training Dataset

  • Size: 11,655 paraphrases
  • Source: 1,665 unique FAQs from knowledge base
  • Paraphrases per FAQ:
    • 4 English paraphrases
    • 2 Hinglish paraphrases
    • Original FAQ
  • Training: InfoNCE Loss + Triplet Loss with E5-base-v2
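A minimal NumPy sketch of the two training objectives (the temperature and margin values are illustrative, not the project's actual hyperparameters; real training operates on E5 embedding batches):

```python
import numpy as np

def info_nce(q, pos, negs, tau=0.05):
    """InfoNCE: softmax cross-entropy of the positive's similarity
    against the negatives' at temperature tau."""
    sims = np.array([q @ pos] + [q @ n for n in negs]) / tau
    sims -= sims.max()                        # numerical stability
    return float(-np.log(np.exp(sims[0]) / np.exp(sims).sum()))

def triplet(q, pos, neg, margin=0.3):
    """Triplet loss: require the positive to score at least
    `margin` higher than the negative."""
    return max(0.0, float(q @ neg - q @ pos) + margin)

q   = np.array([1.0, 0.0])          # anchor query embedding
pos = np.array([1.0, 0.0])          # e.g. a Hinglish paraphrase of q
neg = np.array([0.0, 1.0])          # unrelated FAQ
```

Both losses pull paraphrases toward their source FAQ while pushing unrelated FAQs away, which is what makes informal and Hinglish phrasings retrievable.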

3. Policy Network Training Dataset

  • Size: 182 queries from 6 chat sessions
  • Format: (state, action, reward) tuples
  • Actions: FETCH (1) or NO_FETCH (0)
  • Rewards: +2.0 (correct), +0.5 (needed fetch), -0.5 (incorrect)
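How such (state, action, reward) tuples drive a policy update can be illustrated with a one-step REINFORCE sketch, using a logistic policy as a stand-in for the BERT policy network:

```python
import numpy as np

def reinforce_step(theta, x, action, reward, lr=0.01):
    """One REINFORCE update from a single (state, action, reward)
    tuple, for a logistic policy pi(FETCH|x) = sigmoid(theta . x)."""
    p_fetch = 1.0 / (1.0 + np.exp(-theta @ x))
    # gradient of log pi(action|x) for a Bernoulli policy
    grad_logp = (action - p_fetch) * x
    return theta + lr * reward * grad_logp

theta = np.zeros(2)
x = np.array([1.0, 1.0])                                 # toy query features
theta = reinforce_step(theta, x, action=1, reward=2.0)   # rewarded FETCH
```

A positively rewarded FETCH nudges the policy toward fetching on similar queries; a negative reward nudges it the other way.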

📈 Performance Metrics

Coming soon: Detailed performance metrics including accuracy, response time, token cost reduction, and user satisfaction scores.


📚 API Documentation

Authentication

Register

POST /api/v1/auth/register
Content-Type: application/json

{
  "username": "john_doe",
  "email": "john@example.com",
  "password": "securepassword123"
}

Response:

{
  "message": "User registered successfully",
  "user_id": "507f1f77bcf86cd799439011"
}

Login

POST /api/v1/auth/login
Content-Type: application/json

{
  "username": "john_doe",
  "password": "securepassword123"
}

Response:

{
  "access_token": "eyJhbGciOiJIUzI1NiIs...",
  "token_type": "bearer"
}

Chat

Send Message

POST /api/v1/chat/
Authorization: Bearer <token>
Content-Type: application/json

{
  "query": "What are the interest rates for home loans?",
  "conversation_id": "optional-session-id"
}

Response:

{
  "response": "Current home loan interest rates range from 8.5% to 9.5% per annum...",
  "conversation_id": "abc123",
  "metadata": {
    "policy_action": "FETCH",
    "retrieval_score": 0.89,
    "documents_retrieved": 5,
    "llm_provider": "groq"
  }
}

Get Conversation History

GET /api/v1/chat/conversations/{conversation_id}
Authorization: Bearer <token>

Response:

{
  "conversation_id": "abc123",
  "messages": [
    {
      "role": "user",
      "content": "What are the interest rates?",
      "timestamp": "2025-11-28T10:30:00Z"
    },
    {
      "role": "assistant",
      "content": "Current rates are...",
      "timestamp": "2025-11-28T10:30:05Z",
      "metadata": {
        "policy_action": "FETCH"
      }
    }
  ]
}

List All Conversations

GET /api/v1/chat/conversations
Authorization: Bearer <token>

Delete Conversation

DELETE /api/v1/chat/conversation/{conversation_id}
Authorization: Bearer <token>
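From the client side, a chat request can be assembled as follows; the helper name is ours, and the resulting dict can be sent with any HTTP client:

```python
import json

API_BASE = "https://eeshanyaj-questrag-backend.hf.space"  # deployed backend

def build_chat_request(token, query, conversation_id=None):
    """Assemble the POST /api/v1/chat/ call shown above: bearer-auth
    header plus the JSON body. Omitting `conversation_id` starts a
    new conversation."""
    body = {"query": query}
    if conversation_id:
        body["conversation_id"] = conversation_id
    return {
        "url": f"{API_BASE}/api/v1/chat/",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "data": json.dumps(body),
    }

req = build_chat_request("eyJhbGciOi...", "What are the interest rates for home loans?")
```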

🚀 Deployment

HuggingFace Spaces (Backend)

The backend is deployed on HuggingFace Spaces using Docker:

  1. Models are stored on HuggingFace Hub: eeshanyaj/questrag_models
  2. On first startup, models are automatically downloaded from HF Hub
  3. Docker container runs FastAPI with uvicorn on port 7860
  4. Environment secrets are securely managed in HF Space settings
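Step 2's download-once behaviour can be sketched like this; `ensure_model` is our name for the pattern, and the real startup code presumably calls hf_hub_download against eeshanyaj/questrag_models:

```python
from pathlib import Path

def ensure_model(local_path, download):
    """Download-once startup logic: skip the Hub download when the
    file is already present (e.g. a warm container restart).
    `download` stands in for the actual Hub download call."""
    path = Path(local_path)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        download(path)
    return path

# Demonstrate with a fake downloader that records being called
import tempfile
calls = []
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "models" / "best_policy_model.pth"
    ensure_model(target, lambda p: (calls.append(1), p.write_bytes(b"w")))
    ensure_model(target, lambda p: calls.append(2))   # already cached, skipped
```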

Deployment Steps:

# 1. Upload models to HuggingFace Hub
huggingface-cli upload eeshanyaj/questrag_models \
  app/models/best_policy_model.pth \
  models/best_policy_model.pth

# 2. Push backend code to HF Space
git remote add space https://huggingface.co/spaces/eeshanyaj/questrag-backend
git push space main

# 3. Add environment secrets in HF Space Settings
# (MongoDB URI, Groq keys, JWT secret, etc.)

Frontend Deployment (Vercel/Netlify)

# Build for production
npm run build

# Deploy to Vercel
vercel --prod

# Update .env.production with backend URL
VITE_API_URL=https://eeshanyaj-questrag-backend.hf.space

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow PEP 8 for Python code
  • Use ESLint + Prettier for JavaScript/React
  • Write comprehensive docstrings and comments
  • Add unit tests for new features
  • Update documentation accordingly

📄 License

MIT License, see LICENSE


πŸ™ Acknowledgments

Research Inspiration

  • Main Paper: "Optimizing Retrieval Augmented Generation for Domain-Specific Chatbots with Reinforcement Learning" (AAAI 2024)
  • Additional References:
    • "Evaluating BERT-based Rewards for Question Generation with RL"
    • "Self-Reasoning for Retrieval-Augmented Language Models"

Open Source Resources

Datasets

Technologies


📞 Contact

Eeshanya Amit Joshi
📧 Email
💼 LinkedIn


📈 Status

✅ Backend Deployed & Live!

🚧 Frontend Deployment - Coming Soon!

  • Will be deployed on Vercel/Netlify
  • Stay tuned for full application link! ❀️

🔗 Links


✨ Made with ❤️ for the Banking Industry ✨

Powered by HuggingFace 🤗 | Groq ⚡ | MongoDB 🍃 | Docker 🐳