title: Odisha Disaster RAG Chatbot
emoji: π
colorFrom: green
colorTo: yellow
sdk: docker
app_file: app.py
pinned: false
license: mit
Odisha Disaster Management RAG Chatbot
Overview
Odisha faces recurring disasters every year, including floods, cyclones, and droughts.
While the state has a strong disaster management authority (OSDMA), information is often scattered across reports, research papers, and government documents.
This project builds a Retrieval-Augmented Generation (RAG) based chatbot that provides citizens, researchers, and policymakers with clear, reliable, and contextual answers related to Odisha's disaster management practices.
Features
- Handles 132 PDFs and 12 text files (OSDMA, IMD, NDMA, research papers).
- Preprocessing pipeline: PDF/text extraction, cleaning, normalization, chunking.
- Embeddings with sentence-transformers/all-MiniLM-L6-v2.
- FAISS Vector Database for fast and efficient retrieval.
- RAG pipeline:
  - User query → query structuring (handles poor English, spelling issues).
  - Retrieve relevant chunks from FAISS.
  - If no relevant results → no LLM call (saves cost).
  - If relevant → LLM generates structured, contextual answers.
- Prompt engineering for better accuracy and reduced hallucinations.
- Backend: FastAPI.
- Frontend: HTML, CSS, JS chatbot interface.
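The cleaning and chunking steps of the preprocessing pipeline can be sketched as follows. This is a minimal illustration, not the project's actual script: the function names `clean_text` and `chunk_words`, the word-based windows, and the default sizes are all assumptions chosen for the example.

```python
import re


def clean_text(raw: str) -> str:
    """Normalize extracted text: rejoin hyphenated line breaks, collapse whitespace."""
    text = re.sub(r"-\n(?=\w)", "", raw)  # "manage-\nment" -> "management"
    text = re.sub(r"\s+", " ", text)      # newlines, tabs, repeated spaces -> one space
    return text.strip()


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `chunk_size` words that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Overlapping windows keep a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of a somewhat larger index.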
Architecture
User Query → Query Structuring → FAISS Retriever → Relevant Chunks → LLM → Answer
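The flow above, including the rule that the LLM is skipped when retrieval finds nothing relevant, might be wired together roughly like this. The sketch takes the three stages as plain callables so it stays independent of FAISS and of any particular LLM client. `answer_query`, `FALLBACK`, `top_k`, and the `min_score` cutoff are hypothetical names and values, and the code assumes that a higher score means a closer match (for a distance-based index the comparison would be reversed).

```python
from typing import Callable

# A retriever returns (chunk_text, similarity_score) pairs, best first.
Retriever = Callable[[str, int], list[tuple[str, float]]]

FALLBACK = "No relevant information was found in the indexed documents."


def answer_query(
    query: str,
    structure: Callable[[str], str],
    retrieve: Retriever,
    generate: Callable[[str, list[str]], str],
    top_k: int = 4,
    min_score: float = 0.35,  # hypothetical cutoff; tune it on real queries
) -> str:
    """Run structuring, retrieval, the relevance gate, and generation in order."""
    structured = structure(query)
    hits = retrieve(structured, top_k)
    relevant = [text for text, score in hits if score >= min_score]
    if not relevant:
        return FALLBACK  # gate: the LLM is never called
    return generate(structured, relevant)
```

Passing the stages in as arguments also makes the gate easy to test with stub functions before a real index or model is connected.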
Tech Stack
- Python (data handling & backend)
- PyPDF, TextLoader → PDF/text extraction
- FAISS → Vector database
- HuggingFace Sentence Transformers → Embeddings
- FastAPI → Backend API
- HTML, CSS, JavaScript → Frontend chatbot UI
- LLM (OpenAI / HuggingFace) → Answer generation
Installation
1. Clone the repository
git clone https://github.com/subhakanta156/odisha-disaster-knowledge-assistant.git
2. Create virtual environment & install dependencies
python -m venv venv
source venv/bin/activate # Linux/Mac
venv\Scripts\activate # Windows
pip install -r requirements.txt
3. Prepare the data
- Place all PDFs/text files inside the data/ folder.
- Run preprocessing & embedding script:
python scripts/build_vector_store.py
4. Run the FastAPI backend
uvicorn app.main:app --reload
5. Open the frontend
- Open frontend/index.html in your browser.
Usage
Ask questions like:
- "How does Odisha's disaster proneness compare with other Indian states?"
- "Provide details of relief funds sanctioned for Odisha during the 1999 Super Cyclone."
- "Which Odisha agency is primarily responsible for issuing cyclone alerts?"
- "Explain the key steps taken by the Odisha government if lives are lost in a disaster?"
The system retrieves relevant chunks from reports and generates reliable, structured answers.
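One common way to turn retrieved chunks into a grounded answer is to number them inside the prompt and instruct the model to stay within that context. The sketch below shows the idea only. The function name `build_prompt`, the `max_chunks` limit, and the exact wording of the instructions are assumptions, not the prompt this project ships.

```python
def build_prompt(question: str, chunks: list[str], max_chunks: int = 4) -> str:
    """Assemble a prompt that restricts the model to the retrieved context."""
    numbered = "\n\n".join(
        f"[{i}] {chunk.strip()}"
        for i, chunk in enumerate(chunks[:max_chunks], start=1)
    )
    return (
        "You are an assistant for disaster management in Odisha.\n"
        "Answer using only the context below and cite passages by number, e.g. [1].\n"
        "If the context does not contain the answer, say that you do not know.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )
```

Numbering the passages lets the answer point back to its sources, which makes unsupported statements easier to spot.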
Optimizations
- Added query filtering → No LLM call if retrieval fails (reduces cost).
- Handled poor English queries via query restructuring.
- Improved prompt engineering to minimize hallucinations.
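Query restructuring for misspelled input could be approximated with fuzzy matching against a domain vocabulary, as sketched below using the standard library's `difflib`. This is a stand-in technique for illustration; the project may well use an LLM or another method instead. The `VOCAB` list, the `restructure_query` name, and the `cutoff` value are hypothetical.

```python
import difflib
import re

# Hypothetical domain vocabulary; in practice it could be built from the corpus.
VOCAB = ["cyclone", "flood", "drought", "relief", "odisha", "evacuation", "shelter", "alert"]


def restructure_query(query: str, vocab: list[str] = VOCAB, cutoff: float = 0.8) -> str:
    """Lowercase the query and snap near-miss words onto known domain terms."""
    tokens = re.findall(r"[a-z0-9']+", query.lower())
    fixed = []
    for token in tokens:
        # Keep the best vocabulary match only if it is similar enough.
        match = difflib.get_close_matches(token, vocab, n=1, cutoff=cutoff)
        fixed.append(match[0] if match else token)
    return " ".join(fixed)


print(restructure_query("Cyclon alert in Odisa?"))  # cyclone alert in odisha
```

A high cutoff keeps ordinary words untouched, but a larger vocabulary raises the chance of wrong corrections, so the threshold would need checking against real user queries.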
Future Improvements
- Add multilingual support (Odia/Hindi queries).
- Deploy on cloud (AWS/GCP/Azure) with Docker.
- Use advanced embeddings (e.g., all-mpnet-base-v2) for higher accuracy.
- Add real-time updates (e.g., cyclone alerts).
Author
Subhakanta Rath
MSc AI & ML @ IIIT Lucknow
Passionate about AI/ML and data engineering