---
title: MEDChat AI
emoji: 🏥
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
pinned: false
license: mit
---

# MEDChat AI 🏥

A medical chatbot powered by fine-tuned LLaMA 2 for answering medical questions.

## ⚠️ GPU Hardware Required

This Space uses a 4-bit quantized LLaMA 2 model that requires GPU hardware to run inference.

### How to Test This Application

#### Option 1: Upgrade This Space to GPU (Paid)

1. Click **Settings** in the top navigation
2. Select **Space hardware**
3. Choose **T4 GPU** (~$0.60/hour when running)
4. Click **Save** and wait for the Space to restart

#### Option 2: Run on Google Colab (Free) ⭐ Recommended

1. Visit the GitHub repository
2. Click the Colab badge or download the notebook
3. Open it in Google Colab
4. Select **Runtime → Change runtime type → T4 GPU**
5. Run all cells to test the chatbot with a free GPU

#### Option 3: Watch the Demo Video

**Current Status:** This Space is running on CPU and will display error messages when attempting to generate responses. The interface itself is fully functional and can be explored.
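The GPU requirement comes from the bitsandbytes 4-bit kernels, which only run on CUDA devices. A minimal sketch of how the app could detect this before loading the model (the helper name is an illustration, not code taken from `app.py`):

```python
import torch

def hardware_status() -> str:
    """Report whether 4-bit bitsandbytes inference can run on this host."""
    if torch.cuda.is_available():
        return f"GPU detected: {torch.cuda.get_device_name(0)}"
    return "CPU only: 4-bit quantized inference will fail on this host"
```

On the free CPU tier this check fails, which is why generation requests produce error messages while the rest of the interface still works.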


## Features

- 💬 Medical Q&A using fine-tuned LLaMA 2
- 🔐 User authentication system (demo, in-memory storage)
- 🎨 Clean, intuitive Gradio interface
- 📚 Fine-tuned on a medical terminology dataset
- ⚡ 4-bit quantization for efficient inference

## Usage

1. **Sign Up:** Create an account on the Sign Up tab
2. **Login:** Use your credentials to log in
3. **Chat:** Ask medical questions and get AI-powered responses
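The demo's in-memory authentication can be sketched roughly as follows (a simplification under assumed details; `app.py` may store users differently, and a real deployment would need persistent, salted credential storage):

```python
import hashlib

# In-memory user store: username -> password hash (lost on restart, as in the demo)
users: dict = {}

def _hash(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def sign_up(username: str, password: str) -> bool:
    """Register a new user; returns False if the username is taken."""
    if username in users:
        return False
    users[username] = _hash(password)
    return True

def login(username: str, password: str) -> bool:
    """Check credentials against the in-memory store."""
    return users.get(username) == _hash(password)
```

Because the store is a plain dictionary, all accounts disappear whenever the Space restarts, which is expected behavior for this demo.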

## Example Questions

- What does the immune system do?
- What is epistaxis?
- What are allergies?
- What's the difference between bacteria and viruses?
- Should I start taking creatine?

## Technical Details

### Model Architecture

- **Base Model:** LLaMA 2 (7B parameters)
- **Fine-tuning:** LoRA (Low-Rank Adaptation)
- **Quantization:** 4-bit with bitsandbytes (NF4)
- **Dataset:** Medical terminology corpus

### Tech Stack

- **Framework:** Gradio 6.3.0
- **Model Hub:** Hugging Face Transformers
- **Fine-tuning:** PEFT (Parameter-Efficient Fine-Tuning)
- **Quantization:** bitsandbytes
- **Training:** SFTTrainer from the TRL library

### Model Configuration

- Load in 4-bit: True
- Compute dtype: float16
- Quantization type: NF4
- LoRA rank (r): 16
- LoRA alpha: 16

## Repository Structure

```
├── app.py              # Main Gradio application
├── requirements.txt    # Python dependencies
├── README.md           # This file
└── notebooks/          # Google Colab notebooks
    └── training.ipynb  # Model fine-tuning notebook
```

## Disclaimer

⚠️ **For educational purposes only.** This chatbot is not a substitute for professional medical advice, diagnosis, or treatment. Always consult a qualified healthcare provider for medical concerns.


## License

MIT License. See the LICENSE file for details.


**Questions or Issues?** Open an issue on the GitHub repository or reach out via the Community tab.

*Created as a portfolio project demonstrating LLM fine-tuning, quantization, and deployment techniques.*