	---
	title: "Odisha Disaster RAG Chatbot"
	emoji: "🌊"
	colorFrom: green
	colorTo: yellow
	sdk: docker
	app_file: app.py
	pinned: false
	license: mit
	---
	# 🌀 Odisha Disaster Management RAG Chatbot

	## 📌 Overview
	Odisha faces recurring disasters such as floods, cyclones, and droughts every year.
	While the state has a strong disaster management authority (OSDMA), information is often scattered across reports, research papers, and government documents.

	This project builds a chatbot based on Retrieval-Augmented Generation (RAG) that gives citizens, researchers, and policymakers clear, reliable, and contextual answers about Odisha’s disaster management practices.

	---

	## ✨ Features
	- Handles 132 PDFs and 12 text files (OSDMA, IMD, NDMA, research papers).
	- Preprocessing pipeline: PDF/text extraction, cleaning, normalization, chunking.
	- Embeddings with `sentence-transformers/all-MiniLM-L6-v2`.
	- FAISS Vector Database for fast and efficient retrieval.
	- RAG pipeline:
	    1. User query → query structuring (handles poor English, spelling issues).
	    2. Retrieve relevant chunks from FAISS.
	    3. If nothing relevant is retrieved → skip the LLM call (saves cost).
	    4. If relevant chunks are found → the LLM generates a structured, contextual answer.
	- Prompt engineering for better accuracy and reduced hallucinations.
	- Backend: FastAPI.
	- Frontend: HTML, CSS, JS chatbot interface.
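
The cleaning and chunking steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual preprocessing code: the function names `clean_text` and `chunk_text`, the character-based windowing, and the `chunk_size`/`overlap` values are all assumptions.

```python
import re


def clean_text(raw: str) -> str:
    """Normalize whitespace and re-join words hyphenated across line breaks."""
    # Simplification: this also joins genuinely hyphenated words split at a line end.
    text = re.sub(r"-\s*\n\s*", "", raw)   # "manage-\nment" -> "management"
    text = re.sub(r"\s+", " ", text)       # collapse runs of whitespace
    return text.strip()


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    chunk_size and overlap are illustrative values, not the project's settings.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.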

	---

	## 🏗️ Architecture

	User Query → Query Structuring → FAISS Retriever → Relevant Chunks → LLM → Answer
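
The flow above, including the rule that the LLM is skipped when retrieval finds nothing relevant, could look roughly like the sketch below. Everything here is hypothetical: `Hit`, `answer_query`, the `min_score` cutoff, and the prompt wording are illustrative, and the retriever and LLM are passed in as plain callables so the sketch does not depend on any particular FAISS or LLM client. It assumes `score` is a similarity (higher is better); with a distance metric the comparison would flip.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Hit:
    text: str      # retrieved chunk
    score: float   # similarity score, higher means more relevant (assumption)


def answer_query(
    query: str,
    retrieve: Callable[[str], list[Hit]],
    generate: Callable[[str], str],
    min_score: float = 0.4,  # hypothetical relevance cutoff
) -> Optional[str]:
    """Return an LLM answer, or None when retrieval finds nothing relevant."""
    relevant = [hit for hit in retrieve(query) if hit.score >= min_score]
    if not relevant:
        return None  # no relevant chunks: skip the LLM call entirely
    context = "\n\n".join(hit.text for hit in relevant)
    prompt = (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return generate(prompt)
```

Returning `None` leaves it to the caller (for example the API layer) to decide what fallback message to show the user.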

	## 🛠️ Tech Stack

	- Python (data handling & backend)
	- PyPDF, TextLoader → PDF/Text extraction
	- FAISS → Vector database
	- HuggingFace Sentence Transformers → Embeddings
	- FastAPI → Backend API
	- HTML, CSS, JavaScript → Frontend chatbot UI
	- LLM (OpenAI / HuggingFace) → Answer generation

	---

	## ⚙️ Installation

	### 1. Clone the repository
	```bash
	git clone https://github.com/subhakanta156/odisha-disaster-knowledge-assistant.git
	cd odisha-disaster-knowledge-assistant
	```
	### 2. Create a virtual environment and install dependencies
	```bash
	python -m venv venv
	source venv/bin/activate    # Linux/macOS
	# On Windows, run this instead:
	# venv\Scripts\activate

	pip install -r requirements.txt
	```
	### 3. Prepare the data
	- Place all PDFs/text files inside the `data/` folder.
	- Run the preprocessing and embedding script:
	```bash
	python scripts/build_vector_store.py
	```
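
Before running the script, it can help to check that the `data/` folder contains what you expect. The helper below is a hypothetical sketch, not part of `scripts/build_vector_store.py`: the name `find_documents` and the choice to search subfolders and match only `.pdf` and `.txt` are assumptions.

```python
from pathlib import Path


def find_documents(data_dir: str = "data") -> dict[str, list[Path]]:
    """Group files under data_dir by type, searching subfolders too."""
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"data folder not found: {root}")
    found: dict[str, list[Path]] = {"pdf": [], "txt": []}
    for path in sorted(root.rglob("*")):
        suffix = path.suffix.lower()  # treat ".PDF" and ".pdf" the same
        if path.is_file() and suffix in (".pdf", ".txt"):
            found[suffix.lstrip(".")].append(path)
    return found
```

For example, `{kind: len(paths) for kind, paths in find_documents().items()}` gives a per-type file count to compare against the number of files you placed in the folder.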
	### 4. Run the FastAPI backend
	```bash
	uvicorn app.main:app --reload
	```
	### 5. Open the frontend
	- Open `frontend/index.html` in your browser.

	## 🚀 Usage

	Ask questions like:

	- “How does Odisha’s disaster proneness compare with other Indian states?”
	- “Provide details of relief funds sanctioned for Odisha during the 1999 Super Cyclone.”
	- “Which Odisha agency is primarily responsible for issuing cyclone alerts?”
	- “Explain the key steps taken by the Odisha government if lives are lost in a disaster.”


	The system retrieves the relevant chunks from the indexed reports and generates a structured answer grounded in them.

	---

	## 📊 Optimizations

	- Added query filtering → No LLM call if retrieval fails (reduces cost).
	- Handled poor English queries via query restructuring.
	- Improved prompt engineering to minimize hallucinations.
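
This README does not say how query restructuring is implemented. As one lightweight stand-in, the sketch below corrects near-miss spellings of domain terms with fuzzy matching from Python's standard `difflib` module. The vocabulary, the `cutoff` value, and the name `normalize_query` are all assumptions for illustration. This is not the project's method, which may well use an LLM for this step.

```python
import difflib
import re

# Hypothetical domain vocabulary; a real list could be built from the corpus.
DOMAIN_TERMS = ["cyclone", "flood", "drought", "odisha", "relief", "evacuation", "shelter"]


def normalize_query(query: str, vocabulary: list[str] = DOMAIN_TERMS, cutoff: float = 0.8) -> str:
    """Lowercase the query and snap near-miss spellings to known domain terms."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    fixed = []
    for word in words:
        # Best vocabulary match at or above the similarity cutoff, if any.
        match = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        fixed.append(match[0] if match else word)
    return " ".join(fixed)
```

With this vocabulary, `normalize_query("Cyclon relif in Odisa?")` yields `"cyclone relief in odisha"`. A lower cutoff catches more typos but also rewrites legitimate words that merely resemble a domain term, so the threshold needs tuning against real queries.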

	---

	## 📌 Future Improvements

	- Add multilingual support (Odia/Hindi queries).
	- Deploy on cloud (AWS/GCP/Azure) with Docker.
	- Try a larger embedding model (e.g., `all-mpnet-base-v2`) for higher retrieval accuracy.
	- Add real-time updates (e.g., cyclone alerts).

	---

	## 👨‍💻 Author

	Subhakanta Rath

	MSc AI & ML @ IIIT Lucknow

	Passionate about AI/ML, Data Engineering