metadata
title: Odisha Disaster RAG Chatbot
emoji: 🌊
colorFrom: green
colorTo: yellow
sdk: docker
app_file: app.py
pinned: false
license: mit

🌀 Odisha Disaster Management RAG Chatbot

📌 Overview

Every year, Odisha faces recurring disasters such as floods, cyclones, and droughts.
Although the state has a strong disaster management authority, the Odisha State Disaster Management Authority (OSDMA), the relevant information is often scattered across reports, research papers, and government documents.

This project builds a chatbot based on Retrieval-Augmented Generation (RAG) that gives citizens, researchers, and policymakers clear answers about Odisha’s disaster management practices, grounded in the source documents.


✨ Features

  • Indexes 132 PDFs and 12 text files (from OSDMA, IMD, NDMA, and research papers).
  • Preprocessing pipeline: PDF/text extraction, cleaning, normalization, chunking.
  • Embeddings with sentence-transformers/all-MiniLM-L6-v2.
  • FAISS Vector Database for fast and efficient retrieval.
  • RAG pipeline:
  1. User query → query structuring (handles poor English, spelling issues).
  2. Retrieve relevant chunks from FAISS.
  3. If no relevant results → no LLM call (saves cost).
  4. If relevant → LLM generates structured, contextual answers.
  • Prompt engineering for better accuracy and reduced hallucinations.
  • Backend: FastAPI.
  • Frontend: HTML, CSS, JS chatbot interface.
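The preprocessing steps listed above (cleaning, normalization, chunking) could look roughly like the following. This is a minimal sketch, assuming word-based chunks with a fixed overlap; the README does not state the actual chunk size, overlap, or cleaning rules, and `normalize_text` and `chunk_text` are hypothetical names.

```python
import re
import unicodedata


def normalize_text(raw: str) -> str:
    """Apply Unicode normalization and collapse whitespace."""
    text = unicodedata.normalize("NFKC", raw)
    # Rejoin words hyphenated across line breaks, e.g. "manage-\nment".
    # Caveat: a genuine compound split at a line break loses its hyphen too.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse every run of whitespace (including newlines) to one space.
    return re.sub(r"\s+", " ", text).strip()


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of chunk_size words sharing `overlap` words.

    The default sizes are assumptions for illustration only.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Overlapping windows keep a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.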

🏗️ Architecture

User Query → Query Structuring → FAISS Retriever → Relevant Chunks → LLM → Answer
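The flow above, including the "no relevant results, no LLM call" rule from the feature list, can be sketched as a single function. This is an illustration rather than the project's actual code: `answer_query`, the three injected callables, the `FALLBACK` message, and the `min_score` cutoff are all assumptions.

```python
from typing import Callable, Sequence

# A retrieved chunk paired with its similarity score (higher = more relevant).
ScoredChunk = tuple[str, float]

# Hypothetical reply used when nothing relevant is retrieved.
FALLBACK = "Sorry, I could not find relevant information in the indexed documents."


def answer_query(
    query: str,
    structure_query: Callable[[str], str],
    retrieve: Callable[[str], Sequence[ScoredChunk]],
    generate: Callable[[str, list[str]], str],
    min_score: float = 0.35,  # assumed cutoff; tune it on real queries
) -> str:
    """Run query structuring, retrieval, and (only if needed) generation."""
    cleaned = structure_query(query)
    # Keep only the chunks that clear the relevance cutoff.
    relevant = [chunk for chunk, score in retrieve(cleaned) if score >= min_score]
    if not relevant:
        # Skip the LLM entirely: no cost, and no ungrounded answer.
        return FALLBACK
    return generate(cleaned, relevant)
```

Passing the retriever and the LLM in as callables keeps the gating logic testable without a FAISS index or an API key.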

🛠️ Tech Stack

  • Python (data handling & backend)
  • PyPDF, TextLoader → PDF/Text extraction
  • FAISS → Vector database
  • HuggingFace Sentence Transformers → Embeddings
  • FastAPI → Backend API
  • HTML, CSS, JavaScript → Frontend chatbot UI
  • LLM (OpenAI / HuggingFace) → Answer generation

⚙️ Installation

1. Clone the repository

git clone https://github.com/subhakanta156/odisha-disaster-knowledge-assistant.git
cd odisha-disaster-knowledge-assistant

2. Create a virtual environment and install dependencies

python -m venv venv
source venv/bin/activate   # Linux/Mac
venv\Scripts\activate      # Windows

pip install -r requirements.txt

3. Prepare the data

  • Place all PDF and text files in the data/ folder.
  • Run the preprocessing and embedding script:
python scripts/build_vector_store.py
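Before running the build script, it can help to confirm that the data/ folder holds what you expect (the feature list mentions 132 PDFs and 12 text files). The helper below is a hypothetical pre-flight check, not part of the repository; it assumes the files use `.pdf` and `.txt` extensions and may sit in subfolders.

```python
from collections import Counter
from pathlib import Path


def count_documents(data_dir: str = "data") -> Counter:
    """Count the PDF and text files under data_dir, keyed by extension."""
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"data folder not found: {root}")
    return Counter(
        path.suffix.lower()  # treat ".PDF" and ".pdf" as the same type
        for path in root.rglob("*")  # recurse into subfolders
        if path.is_file() and path.suffix.lower() in {".pdf", ".txt"}
    )
```

For example, `print(count_documents("data"))` shows at a glance whether any files are missing or carry an unexpected extension.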

4. Run the FastAPI backend

uvicorn app.main:app --reload

5. Open the frontend

  • Open frontend/index.html in your browser.

🚀 Usage

Ask questions like:

  • “How does Odisha’s disaster proneness compare with other Indian states?”
  • “Provide details of relief funds sanctioned for Odisha during the 1999 Super Cyclone.”
  • “Which Odisha agency is primarily responsible for issuing cyclone alerts?”
  • “Explain the key steps taken by the Odisha government if lives are lost in a disaster?”

The system retrieves the most relevant chunks from the indexed reports and generates structured answers grounded in them.
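Conceptually, retrieval ranks every stored chunk by how similar its embedding is to the query embedding and keeps the best few. The sketch below does this with an exhaustive cosine-similarity scan in plain Python; it is a stand-in for illustration, not the FAISS API, and the README does not say which FAISS index type or metric the project uses. `cosine` and `top_k` are hypothetical names, and the 2-D vectors in the usage note are toy values (real all-MiniLM-L6-v2 embeddings have 384 dimensions).

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: list[float],
    chunk_vecs: list[list[float]],
    chunks: list[str],
    k: int = 3,
) -> list[tuple[str, float]]:
    """Return the k chunks most similar to the query, best first."""
    scored = [(chunk, cosine(query_vec, vec)) for chunk, vec in zip(chunks, chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

With chunks `["cyclone warning", "drought relief", "flood shelters"]`, vectors `[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]`, and the query vector `[1.0, 0.0]`, `top_k(..., k=2)` ranks "cyclone warning" first and "flood shelters" second.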


📊 Optimizations

  • Added query filtering → no LLM call when retrieval finds nothing relevant (reduces cost).
  • Handled poorly phrased or misspelled queries via query restructuring.
  • Improved prompt engineering to minimize hallucinations.
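One common way to apply the prompt-engineering point above is to restrict the model to the retrieved context and give it an explicit way to decline. The README does not include the project's actual prompt, so the wording and the `build_prompt` helper below are purely illustrative.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a grounded prompt; the wording is illustrative, not the project's."""
    # Number the chunks so the instructions can refer to "the numbered context".
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "You are an assistant for questions about disaster management in Odisha.\n"
        "Answer using ONLY the numbered context below.\n"
        "If the context does not contain the answer, say that you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Giving the model a sanctioned "do not know" path complements the retrieval filter: the filter handles queries with no relevant chunks, while the prompt handles chunks that are related but do not actually answer the question.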

📌 Future Improvements

  • Add multilingual support (Odia/Hindi queries).
  • Deploy on cloud (AWS/GCP/Azure) with Docker.
  • Use advanced embeddings (e.g., all-mpnet-base-v2) for higher accuracy.
  • Add real-time updates (e.g., cyclone alerts).

👨‍💻 Author

Subhakanta Rath

MSc AI & ML @ IIIT Lucknow

Passionate about AI/ML and data engineering