metadata

title: Odisha Disaster RAG Chatbot
emoji: 🌊
colorFrom: green
colorTo: yellow
sdk: docker
app_file: app.py
pinned: false
license: mit

🌀 Odisha Disaster Management RAG Chatbot

📌 Overview

Odisha faces recurring disasters such as floods, cyclones, and droughts every year.
The state has a strong disaster management authority, the Odisha State Disaster Management Authority (OSDMA), but the relevant information is often scattered across reports, research papers, and government documents.

This project builds a chatbot based on Retrieval-Augmented Generation (RAG) that gives citizens, researchers, and policymakers clear, contextual answers about Odisha’s disaster management practices, grounded in those source documents.

✨ Features

Indexes 132 PDFs and 12 text files (sources: OSDMA, IMD, NDMA, and research papers).
Preprocessing pipeline: PDF/text extraction, cleaning, normalization, chunking.
Embeddings with sentence-transformers/all-MiniLM-L6-v2.
FAISS Vector Database for fast and efficient retrieval.
RAG pipeline:

User query → query structuring (handles poor English, spelling issues).
Retrieve relevant chunks from FAISS.
If no relevant results → no LLM call (saves cost).
If relevant → LLM generates structured, contextual answers.

Prompt engineering for better accuracy and reduced hallucinations.
Backend: FastAPI.
Frontend: HTML, CSS, JS chatbot interface.
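
The cleaning, normalization, and chunking steps of the preprocessing pipeline can be sketched as follows. This is a minimal illustration, not the project's actual script: the function names `clean_text` and `chunk_text`, the 200-word chunk size, and the 40-word overlap are all assumptions chosen for the example.

```python
import re
import unicodedata


def clean_text(raw: str) -> str:
    """Normalize Unicode and collapse whitespace left over from PDF extraction."""
    text = unicodedata.normalize("NFKC", raw)
    # Re-join words hyphenated across line breaks, e.g. "manage-\nment".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse all runs of whitespace (including newlines) into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks that overlap with their neighbours."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.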

🏗️ Architecture

User Query → Query Structuring → FAISS Retriever → Relevant Chunks → LLM → Answer
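
The flow above, including the rule that the LLM is skipped when retrieval finds nothing relevant, can be sketched like this. Everything here is illustrative: `answer_query`, the `retrieve` and `generate` callables, the 0.35 similarity threshold, and the fallback message are assumptions, not the project's actual code.

```python
from typing import Callable

FALLBACK = "Sorry, I could not find this in the Odisha disaster documents."


def answer_query(
    query: str,
    retrieve: Callable[[str], list[tuple[str, float]]],
    generate: Callable[[str], str],
    min_score: float = 0.35,  # assumed threshold; tune it on real queries
) -> str:
    """Retrieve chunks, and call the LLM only if at least one is relevant enough."""
    hits = retrieve(query)  # list of (chunk_text, similarity_score) pairs
    relevant = [chunk for chunk, score in hits if score >= min_score]
    if not relevant:
        return FALLBACK  # no LLM call is made, so no generation cost is incurred
    context = "\n\n".join(relevant)
    prompt = (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return generate(prompt)
```

The sketch assumes that a higher score means a closer match, as with cosine similarity. If the retriever returns distances instead, the comparison would need to be reversed.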

🛠️ Tech Stack

Python (data handling & backend)
PyPDF, TextLoader → PDF/Text extraction
FAISS → Vector database
HuggingFace Sentence Transformers → Embeddings
FastAPI → Backend API
HTML, CSS, JavaScript → Frontend chatbot UI
LLM (OpenAI / HuggingFace) → Answer generation
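
To make the retrieval step concrete, the sketch below shows what an exact nearest-neighbour search over embeddings computes: the cosine similarity between a query vector and every stored chunk vector. It is a pure-Python stand-in for illustration only. The project itself uses FAISS with all-MiniLM-L6-v2 embeddings, and the tiny 3-dimensional vectors and the helper names `normalize` and `top_k` are assumptions made for the example.

```python
import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def top_k(query_vec, chunk_vecs, chunks, k=2):
    """Return the k chunks most similar to the query as (chunk, score) pairs."""
    q = normalize(query_vec)
    scored = [
        (chunk, sum(a * b for a, b in zip(q, normalize(vec))))
        for chunk, vec in zip(chunks, chunk_vecs)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy data: real all-MiniLM-L6-v2 embeddings have 384 dimensions, not 3.
chunks = ["cyclone shelters", "drought relief", "flood warnings"]
chunk_vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9], [0.7, 0.6, 0.1]]
results = top_k([1.0, 0.2, 0.0], chunk_vecs, chunks, k=2)
```

An exact inner-product index over unit-length vectors produces the same ranking. A vector database such as FAISS exists to do this efficiently once there are many thousands of chunks.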

⚙️ Installation

1. Clone the repository

git clone https://github.com/subhakanta156/odisha-disaster-knowledge-assistant.git
cd odisha-disaster-knowledge-assistant

2. Create virtual environment & install dependencies

python -m venv venv
source venv/bin/activate   # Linux/Mac
venv\Scripts\activate      # Windows

pip install -r requirements.txt

3. Prepare the data

Place all PDF and text files inside the data/ folder.
Run the preprocessing and embedding script:

python scripts/build_vector_store.py

4. Run the FastAPI backend

uvicorn app.main:app --reload

5. Open the frontend

Open frontend/index.html in your browser.
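
Once the backend is running, it can also be queried from Python instead of the browser. The sketch below only builds the request. The `/chat` path and the `query` JSON field are hypothetical placeholders, so check the route definitions in the backend code for the real names. The address assumes uvicorn's default host and port (127.0.0.1:8000).

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # uvicorn's default host and port


def build_chat_request(question: str, path: str = "/chat") -> urllib.request.Request:
    """Build a JSON POST request; `path` and the `query` field are hypothetical."""
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("Which agency issues cyclone alerts in Odisha?")
# To actually send it (requires the backend to be running):
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.loads(response.read()))
```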

🚀 Usage

Ask questions like:

“How does Odisha’s disaster proneness compare with other Indian states?”
“Provide details of relief funds sanctioned for Odisha during the 1999 Super Cyclone.”
“Which Odisha agency is primarily responsible for issuing cyclone alerts?”
“Explain the key steps taken by the Odisha government if lives are lost in a disaster.”

The system retrieves relevant chunks from reports and generates reliable, structured answers.

📊 Optimizations

Added query filtering → No LLM call if retrieval fails (reduces cost).
Handled poorly phrased or misspelled queries via query restructuring.
Improved prompt engineering to minimize hallucinations.
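
One lightweight way to approximate the spelling side of query restructuring is fuzzy matching against a list of domain terms. The sketch below uses Python's standard `difflib` module for this. It is a swapped-in technique for illustration and may differ from how the project actually restructures queries; `DOMAIN_TERMS`, `normalize_query`, and the 0.8 cutoff are assumptions.

```python
import difflib
import re

# Assumed domain vocabulary; a real list would be built from the corpus.
DOMAIN_TERMS = ["odisha", "cyclone", "flood", "drought", "evacuation", "shelter"]


def normalize_query(query: str, cutoff: float = 0.8) -> str:
    """Lowercase the query, strip punctuation, and fix near-miss domain words."""
    words = re.sub(r"[^\w\s]", " ", query.lower()).split()
    fixed = []
    for word in words:
        match = []
        if len(word) > 3:  # very short words are left untouched
            match = difflib.get_close_matches(word, DOMAIN_TERMS, n=1, cutoff=cutoff)
        fixed.append(match[0] if match else word)
    return " ".join(fixed)


print(normalize_query("What to do in cyclon in Odisa?"))
# prints: what to do in cyclone in odisha
```

This only repairs misspelled domain words. Rephrasing grammatically broken questions would need something stronger, such as an LLM rewriting step.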

📌 Future Improvements

Add multilingual support (Odia/Hindi queries).
Deploy to a cloud platform (AWS/GCP/Azure) with Docker.
Try a larger embedding model (e.g., all-mpnet-base-v2), which may improve retrieval accuracy at a higher compute cost.
Add real-time updates (e.g., cyclone alerts).

👨‍💻 Author

Subhakanta Rath

MSc AI & ML @ IIIT Lucknow

Passionate about AI/ML and Data Engineering