---
title: "JanArogya Chatbot"
emoji: 🩺
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.0.0"
app_file: app.py
pinned: false
---
# 🧠 **Cloud-Based Medical QA System**
### *AWS-Integrated Machine Learning Project*
---
## **Project Overview**
This project demonstrates a **cloud-deployed Machine Learning (ML)** application that answers **medical questions** using a fine-tuned QA model.
The system leverages **AWS services** to ensure **scalability**, **accessibility**, and **secure management** of data and infrastructure.
The model was trained and deployed on an **Amazon EC2** instance, with data securely stored in an **Amazon S3** bucket.
Access permissions and security were managed using **AWS IAM (Identity and Access Management)**.
---
## **AWS Services Used**
### **1. AWS Identity and Access Management (IAM)**
**Purpose:**
- 🧾 Created and managed secure access permissions for different AWS resources.
- 👤 Configured a custom IAM user/role with limited access to S3 and EC2.
- 🛡️ Followed the principle of *least privilege* to ensure minimal security risks.
- Used IAM for safe credential management during local access testing.
**Alternative (Not Implemented):**
- 🧰 *AWS Secrets Manager* for automatic credential rotation.
**Reason:** Not necessary for a small-scale academic deployment; it would add complexity and cost.
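
To make the *least privilege* idea concrete, here is a minimal sketch of a policy document that grants read access to a single S3 object and nothing else. The bucket name `example-medquad-bucket` and the helper `build_read_only_policy` are hypothetical; the policy actually attached in this project is not reproduced here.

```python
import json


def build_read_only_policy(bucket: str, key: str) -> dict:
    """Return an IAM policy document that allows reading one S3 object only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadDatasetOnly",
                "Effect": "Allow",
                # Only the read action is granted; no write, list, or delete.
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{key}",
            }
        ],
    }


# Hypothetical bucket name, used for illustration only.
policy = build_read_only_policy("example-medquad-bucket", "cleaned_medquad.csv")
print(json.dumps(policy, indent=2))
```

The printed JSON could be pasted into the IAM console as an inline policy for the user or role.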
---
### **2. Amazon S3 (Simple Storage Service)**
**Purpose:**
- Used for storing the `cleaned_medquad.csv` dataset, providing a reliable, cloud-based data source for the ML model.
- 📤 The dataset was uploaded manually to an S3 bucket.
- Served as a centralized, secure data storage solution.
**Alternative (Not Implemented):**
- Direct programmatic access using the **Boto3 SDK** to read data from S3 within the EC2 app.
**Reason:** For demonstration purposes, a manual upload was sufficient; programmatic integration was skipped to keep the focus on the AWS setup.
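
For reference, the Boto3 alternative described above might look roughly like the sketch below. It is not part of the deployed app. The function name `load_dataset_from_s3` is hypothetical, the bucket name is a placeholder, and the CSV column names are whatever the file's header row contains. The client is passed in as an argument so the parsing logic can be exercised without AWS credentials.

```python
import csv
import io


def load_dataset_from_s3(bucket, key, s3_client=None):
    """Download a CSV object from S3 and return its rows as dictionaries."""
    if s3_client is None:
        # Imported lazily so the function can be tested with a stand-in client.
        import boto3

        # Uses the default credential chain (for example an attached IAM role).
        s3_client = boto3.client("s3")

    response = s3_client.get_object(Bucket=bucket, Key=key)
    text = response["Body"].read().decode("utf-8")
    # Each row becomes a dict keyed by the CSV header names.
    return list(csv.DictReader(io.StringIO(text)))
```

On the EC2 instance this would be called as `load_dataset_from_s3("<your-bucket>", "cleaned_medquad.csv")`, with the bucket name filled in.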
---
### 💻 **3. Amazon EC2 (Elastic Compute Cloud)**
**Purpose:**
- Used to host and run the ML model and **Gradio interface**.
- Configured a `t2.medium` Ubuntu instance for deployment.
- 🧩 Ran the Flask/Gradio app and tested it successfully via its public URL.
- Verified full model functionality and response generation.
**Alternative (Not Implemented):**
- 🪶 *AWS Lambda* or *AWS SageMaker* for serverless or managed ML hosting.
**Reason:** EC2 was ideal for this scale and provided better control over dependencies and environment setup.
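
As an illustration of how a Gradio interface can be exposed from an EC2 instance, the sketch below wires a question-answering function into `gr.Interface`. It is not the project's `app.py`: `answer_question` is a hypothetical word-overlap lookup standing in for the fine-tuned QA model, and port `7860` is an assumed value that would also need to be opened in the instance's security group.

```python
def answer_question(question, qa_pairs):
    """Placeholder retrieval: return the answer whose question shares the most words."""
    query_words = set(question.lower().split())
    best_answer, best_score = "Sorry, I could not find an answer.", 0
    for known_question, answer in qa_pairs:
        score = len(query_words & set(known_question.lower().split()))
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer


def launch_app(qa_pairs, port=7860):
    """Start a Gradio text-in, text-out interface reachable from outside the instance."""
    import gradio as gr  # imported here so answer_question stays usable without Gradio

    demo = gr.Interface(
        fn=lambda question: answer_question(question, qa_pairs),
        inputs="text",
        outputs="text",
        title="JanArogya Chatbot",
    )
    # Binding to 0.0.0.0 makes the app reachable on the instance's public address.
    demo.launch(server_name="0.0.0.0", server_port=port)
```

Calling `launch_app(pairs)` with a list of `(question, answer)` tuples starts the server; swapping the placeholder for the real model only requires changing `answer_question`.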
---
## **Deployment Flow**
1. 🧺 Dataset uploaded to **S3 bucket**
2. IAM role created for **secure access management**
3. 💻 **EC2 instance** launched and configured
4. 🤖 ML application (Flask + Gradio) deployed and tested
5. Logs and results verified on **terminal and Gradio public link**
---
## 🧩 **Tech Stack**
| Layer | Tools & Technologies |
|-------|----------------------|
| 🧠 **Backend** | Python (Flask, Gradio) |
| **Cloud** | AWS EC2, S3, IAM |
| 🧰 **Libraries** | Pandas, Transformers, Scikit-learn |
| **Dataset** | Medical QA Dataset (`cleaned_medquad.csv`) |
---
## 📸 **Screenshots**
Available in the `/screenshots` folder:
1. 🧑‍💻 IAM Roles and Permissions
2. 🪣 S3 Bucket with Uploaded Dataset
3. 💻 EC2 Instance Configuration
4. 🧾 Terminal Log (Model Running Successfully)
5. Gradio Interface Screenshot
---
## **Future Integration**
Planned upgrades for the next version:
1. 🤖 Automate dataset retrieval using **Boto3**
2. Integrate **IAM role-based S3 access** into code
3. 🧱 Deploy the model using **AWS SageMaker** for managed ML
4. 💾 Store model responses in **Amazon RDS** for persistence
---
## 🧹 **Resource Management**
All AWS resources (**EC2**, **IAM**, and **S3**) were **terminated or deleted** after testing to prevent unnecessary billing.
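
The cleanup for this project was done by hand, but a scripted version could look something like the sketch below. The function `tear_down` and its arguments are hypothetical; the client methods it calls (`terminate_instances`, `delete_object`, `delete_bucket`) are standard Boto3 operations. It assumes the bucket holds only the single dataset object, since S3 refuses to delete a non-empty bucket, and it leaves IAM cleanup out.

```python
def tear_down(ec2_client, s3_client, instance_id, bucket, key):
    """Terminate one EC2 instance, then remove one S3 object and its bucket."""
    steps = []

    # Terminating (not just stopping) the instance ends compute billing for it.
    ec2_client.terminate_instances(InstanceIds=[instance_id])
    steps.append("terminated instance")

    # The bucket must be empty before it can be deleted.
    s3_client.delete_object(Bucket=bucket, Key=key)
    steps.append("deleted object")

    s3_client.delete_bucket(Bucket=bucket)
    steps.append("deleted bucket")

    return steps
```

With real credentials, the clients would come from `boto3.client("ec2")` and `boto3.client("s3")`; the instance ID and bucket name are the ones shown in the AWS console.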
---
## 👨‍💻 **Author**
**Aaditya Arvind Ramrame**
🌩️ *Cloud and Machine Learning Enthusiast*
📧 [aadityaramrame@gmail.com](mailto:aadityaramrame@gmail.com)
[GitHub Profile](https://github.com/Aadityaramrame)
---
*“Bringing Machine Learning to the Cloud, One Service at a Time.”*