title: JanArogya Chatbot
emoji: 🩺
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false

🧠 Cloud-Based Medical QA System

☁️ AWS-Integrated Machine Learning Project


📋 Project Overview

This project demonstrates a cloud-deployed Machine Learning (ML) application that answers medical questions using a fine-tuned QA model.
The system uses AWS services for scalability, accessibility, and secure management of data and infrastructure.

The model was trained and deployed on an Amazon EC2 instance, with data securely stored in an Amazon S3 bucket.
Access permissions and security were managed using AWS IAM (Identity and Access Management).


☁️ AWS Services Used

🔐 1. AWS Identity and Access Management (IAM)

Purpose:

  • 🧾 Created and managed secure access permissions for different AWS resources.
  • 👤 Configured a custom IAM user/role with limited access to S3 and EC2.
  • 🛡️ Followed the principle of least privilege to ensure minimal security risks.
  • 🔑 Used IAM for safe credential management during local access testing.

Alternative (Not Implemented):

  • 🧰 AWS Secrets Manager for automatic credential rotation.

Reason: Not necessary for a small-scale academic deployment; it would add complexity and cost.
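
To illustrate what a least-privilege policy for this setup could look like, here is a minimal sketch that builds an IAM policy document granting read-only access to a single S3 bucket. The function name and the bucket name `example-medquad-bucket` are hypothetical; the actual policy attached in this project is not reproduced here and also covered EC2 access.

```python
import json


def build_least_privilege_policy(bucket_name: str) -> dict:
    """Return an IAM policy document allowing read-only access to one S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListDatasetBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "ReadDatasetObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name, used only for illustration.
    print(json.dumps(build_least_privilege_policy("example-medquad-bucket"), indent=2))
```

The printed JSON can be pasted into the IAM console as a custom policy. No write or delete actions are granted, which keeps the role limited to what a read-only dataset consumer needs.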


πŸ—ƒοΈ 2. Amazon S3 (Simple Storage Service)

Purpose:

  • ☁️ Used for storing the cleaned_medquad.csv dataset, providing a reliable, cloud-based data source for the ML model.
  • 📤 The dataset was uploaded manually to an S3 bucket.
  • 🗂️ Served as a centralized, secure data storage solution.

Alternative (Not Implemented):

  • 🔗 Direct programmatic access using the Boto3 SDK to read data from S3 within the EC2 app.

Reason: Manual upload was sufficient for demonstration purposes, so programmatic integration was skipped to keep the focus on the AWS setup.
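
For reference, the unimplemented Boto3 alternative could look roughly like the sketch below. It is not part of the deployed app. The function name, the bucket name `example-medquad-bucket`, and the use of the standard `csv` module instead of Pandas are assumptions made to keep the example small.

```python
import csv
import io


def load_dataset_from_s3(s3_client, bucket: str, key: str) -> list[dict]:
    """Download a CSV object from S3 and parse it into a list of row dicts."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    text = response["Body"].read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


def main() -> None:
    # boto3 is imported lazily so the parsing logic above has no hard dependency on it.
    import boto3

    # Credentials are resolved by boto3 (instance role on EC2, or local AWS config).
    s3 = boto3.client("s3")
    # Hypothetical bucket name; the key matches the dataset file named in this README.
    rows = load_dataset_from_s3(s3, "example-medquad-bucket", "cleaned_medquad.csv")
    print(f"Loaded {len(rows)} rows")
```

Passing the client in as an argument keeps the function easy to exercise with a stub object. On the EC2 instance, calling `main()` would use whatever IAM role is attached to the instance.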


💻 3. Amazon EC2 (Elastic Compute Cloud)

Purpose:

  • 🚀 Used to host and run the ML model and Gradio interface.
  • ⚙️ Configured a t2.medium Ubuntu instance for deployment.
  • 🧩 Ran the Flask/Gradio app and tested it successfully via its public URL.
  • 🔍 Verified full model functionality and response generation.

Alternative (Not Implemented):

  • 🪶 AWS Lambda or Amazon SageMaker for serverless or managed ML hosting.

Reason: EC2 was ideal for this scale and provided better control over dependencies and environment setup.
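
As a rough picture of how a Gradio interface can be served from an EC2 instance, consider the sketch below. The `answer_question` function is a hypothetical placeholder and does not call the project's fine-tuned model. Binding to `0.0.0.0` on port 7860 is likewise an assumption, and it only works if the instance's security group allows inbound traffic on that port.

```python
def answer_question(question: str) -> str:
    """Hypothetical placeholder for the fine-tuned QA model's inference call."""
    question = question.strip()
    if not question:
        return "Please enter a medical question."
    # The real app would run the model here; this stub only echoes the input.
    return f"(placeholder answer for: {question})"


def main() -> None:
    # gradio is imported lazily so the handler above can be tested without it.
    import gradio as gr

    demo = gr.Interface(
        fn=answer_question,
        inputs="text",
        outputs="text",
        title="JanArogya Chatbot",
    )
    # Listen on all interfaces so the app is reachable via the instance's public address.
    demo.launch(server_name="0.0.0.0", server_port=7860)
```

Calling `main()` starts the server. Passing `share=True` to `launch` instead would ask Gradio for a temporary public link, which avoids opening a port on the instance.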


🚀 Deployment Flow

  1. 🧺 Dataset uploaded to S3 bucket
  2. 🔐 IAM role created for secure access management
  3. 💻 EC2 instance launched and configured
  4. 🤖 ML application (Flask + Gradio) deployed and tested
  5. 📈 Logs and results verified on terminal and Gradio public link

🧩 Tech Stack

| Layer | Tools & Technologies |
|---|---|
| 🧠 Backend | Python (Flask, Gradio) |
| ☁️ Cloud | AWS EC2, S3, IAM |
| 🧰 Libraries | Pandas, Transformers, Scikit-learn |
| 📊 Dataset | Medical QA Dataset (cleaned_medquad.csv) |

📸 Screenshots

📁 Available in the /screenshots folder:

  1. 🧑‍💻 IAM Roles and Permissions
  2. 🪣 S3 Bucket with Uploaded Dataset
  3. 💻 EC2 Instance Configuration
  4. 🧾 Terminal Log (Model Running Successfully)
  5. 🌐 Gradio Interface Screenshot

βš™οΈ Future Integration

Planned upgrades for the next version:

  1. 🤖 Automate dataset retrieval using Boto3
  2. 🔐 Integrate IAM role-based S3 access into code
  3. 🧱 Deploy the model using Amazon SageMaker for managed ML
  4. 💾 Store model responses in Amazon RDS for persistence

🧹 Resource Management

All AWS resources were cleaned up after testing to prevent unnecessary billing: the EC2 instance was terminated, and the S3 bucket and IAM user/role were deleted.
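
Cleanup of this kind can also be scripted. The following is a minimal sketch, assuming Boto3 clients are passed in and that the bucket holds fewer than 1,000 objects (a single `list_objects_v2` page). The function name and its parameters are hypothetical, and IAM cleanup is left out because it depends on which users, roles, and policies were created.

```python
def cleanup_resources(ec2_client, s3_client, instance_id: str, bucket: str) -> None:
    """Terminate one EC2 instance, then empty and delete one S3 bucket."""
    ec2_client.terminate_instances(InstanceIds=[instance_id])

    # A bucket must be empty before S3 will delete it.
    listing = s3_client.list_objects_v2(Bucket=bucket)
    for obj in listing.get("Contents", []):
        s3_client.delete_object(Bucket=bucket, Key=obj["Key"])

    s3_client.delete_bucket(Bucket=bucket)
```

With real clients this would be invoked as `cleanup_resources(boto3.client("ec2"), boto3.client("s3"), instance_id, bucket)`, where both identifiers come from the actual deployment. The operations are destructive, so they need to be run with care.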


👨‍💻 Author

Aaditya Arvind Ramrame
🌩️ Cloud and Machine Learning Enthusiast
📧 aadityaramrame@gmail.com
🔗 GitHub Profile


⭐ “Bringing Machine Learning to the Cloud, One Service at a Time.” ☁️