---
title: "JanArogya Chatbot"
emoji: 🩺
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.0.0"
app_file: app.py
pinned: false
---

# 🧠 **Cloud-Based Medical QA System**

### ☁️ *AWS-Integrated Machine Learning Project*

---

## 📋 **Project Overview**

This project demonstrates a **cloud-deployed Machine Learning (ML)** application built to answer **medical questions** using a fine-tuned QA model. The system leverages **AWS services** to ensure **scalability**, **accessibility**, and **secure management** of data and infrastructure.

The model was trained and deployed on an **Amazon EC2** instance, with data securely stored in an **Amazon S3** bucket. Access permissions and security were managed using **AWS IAM (Identity and Access Management)**.

---

## ☁️ **AWS Services Used**

### 🔐 **1. AWS Identity and Access Management (IAM)**

**Purpose:**

- 🧾 Created and managed secure access permissions for different AWS resources.
- 👤 Configured a custom IAM user/role with limited access to S3 and EC2.
- 🛡️ Followed the principle of *least privilege* to minimize security risks.
- 🔑 Used IAM for safe credential management during local access testing.

**Alternative (Not Implemented):**

- 🧰 *AWS Secrets Manager* for automatic credential rotation.

**Reason:** Not necessary for a small-scale academic deployment, and it would increase complexity and cost.

---

### 🗃️ **2. Amazon S3 (Simple Storage Service)**

**Purpose:**

- ☁️ Used for storing the `cleaned_medquad.csv` dataset, providing a reliable, cloud-based data source for the ML model.
- 📤 The dataset was uploaded manually to an S3 bucket.
- 🗂️ Served as a centralized, secure data storage solution.

**Alternative (Not Implemented):**

- 🔗 Direct programmatic access using the **Boto3 SDK** to read data from S3 within the EC2 app.

**Reason:** For demonstration purposes, manual upload was sufficient, and the integration was skipped to focus on showcasing the AWS setup.

---

### 💻 **3. Amazon EC2 (Elastic Compute Cloud)**

**Purpose:**

- 🚀 Used to host and run the ML model and **Gradio interface**.
- ⚙️ Configured a `t2.medium` Ubuntu instance for deployment.
- 🧩 Ran the Flask/Gradio app and tested it successfully via a public URL.
- 🔍 Verified full model functionality and response generation.

**Alternative (Not Implemented):**

- 🪶 *AWS Lambda* or *Amazon SageMaker* for serverless or managed ML hosting.

**Reason:** EC2 was ideal for this scale and provided better control over dependencies and environment setup.

---

## 🚀 **Deployment Flow**

1. 🧺 Dataset uploaded to **S3 bucket**
2. 🔐 IAM role created for **secure access management**
3. 💻 **EC2 instance** launched and configured
4. 🤖 ML application (Flask + Gradio) deployed and tested
5. 📈 Logs and results verified in the **terminal and via the Gradio public link**

---

## 🧩 **Tech Stack**

| Layer | Tools & Technologies |
|-------|----------------------|
| 🧠 **Backend** | Python (Flask, Gradio) |
| ☁️ **Cloud** | AWS EC2, S3, IAM |
| 🧰 **Libraries** | Pandas, Transformers, Scikit-learn |
| 📊 **Dataset** | Medical QA Dataset (`cleaned_medquad.csv`) |

---

## 📸 **Screenshots**

📁 Available in the `/screenshots` folder:

1. 🧑‍💻 IAM Roles and Permissions
2. 🪣 S3 Bucket with Uploaded Dataset
3. 💻 EC2 Instance Configuration
4. 🧾 Terminal Log (Model Running Successfully)
5. 🌐 Gradio Interface Screenshot

---

## ⚙️ **Future Integration**

Planned upgrades for the next version:

1. 🤖 Automate dataset retrieval using **Boto3**
2. 🔐 Integrate **IAM role-based S3 access** into code
3. 🧱 Deploy the model using **Amazon SageMaker** for managed ML
4. 💾 Store model responses in **Amazon RDS** for persistence

---

## 🧹 **Resource Management**

All AWS resources (the **EC2** instance, **IAM** user/role, and **S3** bucket) were **safely terminated or deleted** after testing to prevent unnecessary billing.

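
The first two upgrades listed under Future Integration (Boto3 dataset retrieval with IAM role-based S3 access) are not implemented in this project. As a rough illustration only, they might look something like the sketch below. The function names `load_dataset_rows` and `load_from_s3` are hypothetical, and the bucket name is left for the caller to supply. The sketch assumes the EC2 instance has an IAM role attached that allows `s3:GetObject` on the bucket, so no access keys appear in the code. The S3 client is passed in as a parameter so the parsing logic can be exercised without AWS access.

```python
import csv
import io


def load_dataset_rows(s3_client, bucket, key):
    """Fetch a CSV object from S3 and return its rows as a list of dicts."""
    # get_object returns a dict whose "Body" is a stream of the object's bytes.
    response = s3_client.get_object(Bucket=bucket, Key=key)
    text = response["Body"].read().decode("utf-8")
    # DictReader uses the CSV header row as the keys of each row dict.
    return list(csv.DictReader(io.StringIO(text)))


def load_from_s3(bucket, key="cleaned_medquad.csv"):
    """Build a real S3 client and load the dataset (requires boto3)."""
    # Imported here so the parsing helper above works without boto3 installed.
    import boto3

    # No access keys are passed: on EC2, boto3 resolves credentials from the
    # IAM role attached to the instance (assumed to allow s3:GetObject).
    return load_dataset_rows(boto3.client("s3"), bucket, key)
```
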

---

## 👨‍💻 **Author**

**Aaditya Arvind Ramrame**

🌩️ *Cloud and Machine Learning Enthusiast*

📧 [aadityaramrame@gmail.com](mailto:aadityaramrame@gmail.com)

🔗 [GitHub Profile](https://github.com/Aadityaramrame)

---

⭐ *“Bringing Machine Learning to the Cloud, One Service at a Time.”* ☁️