---
title: "JanArogya Chatbot"
emoji: 🩺
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.0.0"
app_file: app.py
pinned: false
---
# 🧠 **Cloud-Based Medical QA System**
### ☁️ *AWS-Integrated Machine Learning Project*
---
## 📋 **Project Overview**
This project demonstrates a **cloud-deployed Machine Learning (ML)** application built to answer **medical-related questions** using a fine-tuned QA model.
The system leverages **AWS services** to ensure **scalability**, **accessibility**, and **secure management** of data and infrastructure.
The model was trained and deployed on an **Amazon EC2** instance, with data securely stored in an **Amazon S3** bucket.
Access permissions and security were managed using **AWS IAM (Identity and Access Management)**.
---
## ☁️ **AWS Services Used**
### 🔐 **1. AWS Identity and Access Management (IAM)**
**Purpose:**
- 🧾 Created and managed secure access permissions for different AWS resources.
- 👤 Configured a custom IAM user/role with limited access to S3 and EC2.
- 🛡️ Followed the principle of *least privilege* to ensure minimal security risks.
- 🔑 Used IAM for safe credential management during local access testing.
**Alternative (Not Implemented):**
- 🧰 *AWS Secrets Manager* for automatic credential rotation.
**Reason:** Not necessary for a small-scale academic deployment; it would add complexity and cost.
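
To make the least-privilege idea above concrete, here is a minimal sketch of a policy document that allows reading only the dataset object. The bucket name `example-medquad-bucket` and the helper `build_read_only_policy` are hypothetical and are not taken from this project's actual IAM setup; only the object key `cleaned_medquad.csv` comes from this README.

```python
import json

# Hypothetical bucket name (assumption); the key matches the dataset named in this README.
BUCKET = "example-medquad-bucket"
DATASET_KEY = "cleaned_medquad.csv"


def build_read_only_policy(bucket: str, key: str) -> dict:
    """Return an IAM policy document that allows reading a single S3 object."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadDatasetObject",
                "Effect": "Allow",
                # Only GetObject is granted: no listing, writing, or deleting.
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{key}",
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into the IAM console policy editor.
    print(json.dumps(build_read_only_policy(BUCKET, DATASET_KEY), indent=2))
```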
---
### 🗃️ **2. Amazon S3 (Simple Storage Service)**
**Purpose:**
- ☁️ Used for storing the `cleaned_medquad.csv` dataset, providing a reliable, cloud-based data source for the ML model.
- 📤 The dataset was uploaded manually to an S3 bucket.
- 🗂️ Served as a centralized, secure data storage solution.
**Alternative (Not Implemented):**
- 🔗 Direct programmatic access using the **Boto3 SDK** to read data from S3 within the EC2 app.
**Reason:** Manual upload was sufficient for this demonstration, so programmatic integration was skipped to keep the focus on the AWS setup.
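
For reference, the Boto3 alternative mentioned above could look roughly like the sketch below. It is not part of the deployed app. The function name `load_dataset` is hypothetical, the bucket name must be supplied by the caller, and the CSV is parsed with the standard `csv` module here (Pandas would work equally well). No column names are assumed.

```python
import csv
import io


def load_dataset(bucket: str, key: str, s3_client=None) -> list[dict]:
    """Download a CSV object from S3 and return its rows as dictionaries."""
    if s3_client is None:
        # boto3 is a third-party dependency; imported lazily so the module
        # can be loaded (and tested with a stub client) without it.
        import boto3

        # Credentials come from the environment or the instance's IAM role.
        s3_client = boto3.client("s3")

    response = s3_client.get_object(Bucket=bucket, Key=key)
    text = response["Body"].read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


# Example call (bucket name is a placeholder):
# rows = load_dataset("example-medquad-bucket", "cleaned_medquad.csv")
```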
---
### 💻 **3. Amazon EC2 (Elastic Compute Cloud)**
**Purpose:**
- 🚀 Used to host and run the ML model and **Gradio interface**.
- ⚙️ Configured a `t2.medium` Ubuntu instance for deployment.
- 🧩 Executed Flask/Gradio app and tested successfully via public URL.
- 🔍 Verified full model functionality and response generation.
**Alternative (Not Implemented):**
- 🪶 *AWS Lambda* or *Amazon SageMaker* for serverless or managed ML hosting.
**Reason:** EC2 was ideal for this scale and provided better control over dependencies and environment setup.
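
As an illustration of how a Gradio interface can be served from an EC2 instance, here is a minimal sketch. The `answer_question` function is only a placeholder for the fine-tuned QA model, and the host and port values are assumptions rather than this project's recorded configuration (7860 is Gradio's default port).

```python
def answer_question(question: str) -> str:
    """Placeholder for the QA model: echoes the question instead of answering it."""
    question = question.strip()
    if not question:
        return "Please enter a question."
    return f"(model answer for: {question})"


def main() -> None:
    # gradio is a third-party dependency; imported lazily inside main().
    import gradio as gr

    demo = gr.Interface(
        fn=answer_question,
        inputs="text",
        outputs="text",
        title="JanArogya Chatbot",
    )
    # Binding to 0.0.0.0 makes the app reachable on the instance's public
    # address, provided the security group allows inbound traffic on the port.
    # Passing share=True instead would request a temporary public Gradio link.
    demo.launch(server_name="0.0.0.0", server_port=7860)


# To start the server, call main(), for example from app.py.
```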
---
## 🚀 **Deployment Flow**
1. 🧺 Dataset uploaded to **S3 bucket**
2. 🔐 IAM role created for **secure access management**
3. 💻 **EC2 instance** launched and configured
4. 🤖 ML application (Flask + Gradio) deployed and tested
5. 📈 Logs and results verified on **terminal and Gradio public link**
---
## 🧩 **Tech Stack**
| Layer | Tools & Technologies |
|-------|----------------------|
| 🧠 **Backend** | Python (Flask, Gradio) |
| ☁️ **Cloud** | AWS EC2, S3, IAM |
| 🧰 **Libraries** | Pandas, Transformers, Scikit-learn |
| 📊 **Dataset** | Medical QA Dataset (`cleaned_medquad.csv`) |
---
## 📸 **Screenshots**
📁 Available in the `/screenshots` folder:
1. 🧑‍💻 IAM Roles and Permissions
2. 🪣 S3 Bucket with Uploaded Dataset
3. 💻 EC2 Instance Configuration
4. 🧾 Terminal Log (Model Running Successfully)
5. 🌐 Gradio Interface Screenshot
---
## ⚙️ **Future Integration**
Planned upgrades for the next version:
1. 🤖 Automate dataset retrieval using **Boto3**
2. 🔐 Integrate **IAM role-based S3 access** into code
3. 🧱 Deploy the model using **Amazon SageMaker** for managed ML
4. 💾 Store model responses in **Amazon RDS** for persistence
---
## 🧹 **Resource Management**
The **EC2** instance was **terminated**, and the **S3** bucket and **IAM** user/role were **deleted**, after testing to prevent unnecessary billing.
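
The cleanup could also be scripted. The sketch below is a hypothetical helper, not the procedure actually used for this project. The function name `tear_down`, the instance ID, and the bucket name are placeholders; the clients are expected to be Boto3 `ec2` and `s3` clients.

```python
def tear_down(ec2_client, s3_client, instance_id: str, bucket: str, keys: list[str]) -> None:
    """Terminate one EC2 instance, then empty and delete one S3 bucket."""
    ec2_client.terminate_instances(InstanceIds=[instance_id])

    # S3 only allows deleting a bucket once it contains no objects.
    for key in keys:
        s3_client.delete_object(Bucket=bucket, Key=key)
    s3_client.delete_bucket(Bucket=bucket)


# Example call with placeholder identifiers (requires boto3 and credentials):
# import boto3
# tear_down(boto3.client("ec2"), boto3.client("s3"),
#           "i-0123456789abcdef0", "example-medquad-bucket",
#           ["cleaned_medquad.csv"])
```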
---
## 👨‍💻 **Author**
**Aaditya Arvind Ramrame**
🌩️ *Cloud and Machine Learning Enthusiast*
📧 [aadityaramrame@gmail.com](mailto:aadityaramrame@gmail.com)
🔗 [GitHub Profile](https://github.com/Aadityaramrame)
---
⭐ *“Bringing Machine Learning to the Cloud, One Service at a Time.”* ☁️