---
title: MEDChat AI
emoji: 🏥
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
pinned: false
license: mit
---

# MEDChat AI 🏥

A medical chatbot powered by fine-tuned LLaMA 2 for answering medical questions.

## ⚠️ GPU Hardware Required

This Space uses a 4-bit quantized LLaMA 2 model that requires GPU hardware to run inference.

### How to Test This Application

#### Option 1: Upgrade This Space to GPU (Paid)

- Click Settings in the top navigation
- Select Space hardware
- Choose T4 GPU (~$0.60/hour when running)
- Click Save and wait for the Space to restart

#### Option 2: Run on Google Colab (Free) ✅ Recommended

- Visit the GitHub Repository
- Click on the Colab badge or download the notebook
- Open in Google Colab
- Select Runtime → Change runtime type → T4 GPU
- Run all cells to test the chatbot with free GPU

#### Option 3: Watch the Demo Video

**Current Status:** This Space is running on CPU and will display error messages when attempting to generate responses. The interface itself is fully functional and can be explored.
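
In practice the hardware gate boils down to a check like the one below (an illustrative sketch, not the Space's actual `app.py` code):

```python
# Minimal hardware check -- illustrative sketch, not the Space's actual code.
import torch

def describe_hardware() -> str:
    """Report whether GPU inference is possible on the current hardware."""
    if torch.cuda.is_available():
        return f"GPU detected: {torch.cuda.get_device_name(0)}"
    return "No GPU detected: 4-bit LLaMA 2 inference will fail on this hardware."

print(describe_hardware())
```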

## Features

- 💬 Medical Q&A support using fine-tuned LLaMA 2
- 🔐 User authentication system (demo - in-memory storage)
- 🎨 Clean, intuitive Gradio interface
- 📚 Fine-tuned on a medical terminology dataset
- ⚡ 4-bit quantization for efficient inference

## Usage

- Sign Up: Create an account on the Sign Up tab
- Login: Use your credentials to log in
- Chat: Ask medical questions and get AI-powered responses (see the sketch below)
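
The flow above could be wired together roughly as follows. This is a hedged sketch, not the actual `app.py`: function names and layout are illustrative, the chat tab is omitted for brevity, and the plain dict mirrors the demo's in-memory storage noted under Features (accounts disappear on restart).

```python
import gradio as gr

users = {}  # username -> password; demo-only in-memory storage

def sign_up(username, password):
    if username in users:
        return "Username already taken."
    users[username] = password
    return "Account created. You can now log in."

def login(username, password):
    if users.get(username) == password:
        return "Logged in! Switch to the Chat tab to ask questions."
    return "Invalid credentials."

with gr.Blocks() as demo:
    with gr.Tab("Sign Up"):
        su_user = gr.Textbox(label="Username")
        su_pass = gr.Textbox(label="Password", type="password")
        su_out = gr.Markdown()
        gr.Button("Sign Up").click(sign_up, [su_user, su_pass], su_out)
    with gr.Tab("Login"):
        li_user = gr.Textbox(label="Username")
        li_pass = gr.Textbox(label="Password", type="password")
        li_out = gr.Markdown()
        gr.Button("Login").click(login, [li_user, li_pass], li_out)

if __name__ == "__main__":
    demo.launch()
```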

## Example Questions

- What does the immune system do?
- What is epistaxis?
- What are allergies?
- What's the difference between bacteria and viruses?
- Should I start taking creatine?

## Technical Details

### Model Architecture

- Base Model: LLaMA 2 (7B parameters)
- Fine-tuning: LoRA (Low-Rank Adaptation)
- Quantization: 4-bit with bitsandbytes (NF4)
- Dataset: Medical terminology corpus
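
Put together, loading this stack for inference looks roughly like the sketch below. The adapter id `your-username/medchat-lora` is a placeholder (the README does not name the published weights), and the LLaMA 2 base weights are gated, so a Hugging Face access token is required.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Gated base weights: requires accepting Meta's license and an access token.
base_id = "meta-llama/Llama-2-7b-chat-hf"
base = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Placeholder adapter repo -- substitute the actual fine-tuned LoRA weights.
model = PeftModel.from_pretrained(base, "your-username/medchat-lora")

inputs = tokenizer("What is epistaxis?", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```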

### Tech Stack

- Framework: Gradio 6.3.0
- Model Hub: Hugging Face Transformers
- Fine-tuning: PEFT (Parameter-Efficient Fine-Tuning)
- Quantization: bitsandbytes
- Training: SFTTrainer from the TRL library (sketched below)
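
A condensed sketch of the training recipe this stack implies, written against TRL's pre-0.9 `SFTTrainer` signature (newer TRL releases move several of these arguments into `SFTConfig`); the dataset id and all hyperparameters other than the LoRA values are placeholders:

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# Placeholder dataset id -- substitute the actual medical terminology corpus.
dataset = load_dataset("your-username/medical-terms", split="train")

peft_config = LoraConfig(r=16, lora_alpha=16, task_type="CAUSAL_LM")

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",  # gated base weights
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",         # column containing the training text
    max_seq_length=512,
    args=TrainingArguments(
        output_dir="medchat-lora",
        per_device_train_batch_size=4,
        num_train_epochs=1,
    ),
)
trainer.train()
```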

### Model Configuration

- Load in 4-bit: True
- Compute dtype: float16
- Quantization type: nf4
- LoRA rank (r): 16
- LoRA alpha: 16
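
These bullets map one-to-one onto the standard bitsandbytes and PEFT config objects, as sketched below; the dropout value and target modules are assumptions, since the README does not list them.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # Load in 4-bit: True
    bnb_4bit_compute_dtype=torch.float16,  # Compute dtype: float16
    bnb_4bit_quant_type="nf4",             # Quantization type: nf4
)

lora_config = LoraConfig(
    r=16,                                 # LoRA rank (r): 16
    lora_alpha=16,                        # LoRA alpha: 16
    lora_dropout=0.05,                    # assumption: not stated in the README
    target_modules=["q_proj", "v_proj"],  # assumption: typical LLaMA targets
    task_type="CAUSAL_LM",
)
```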

## Repository Structure

```
├── app.py               # Main Gradio application
├── requirements.txt     # Python dependencies
├── README.md            # This file
└── notebooks/           # Google Colab notebooks
    └── training.ipynb   # Model fine-tuning notebook
```

## Disclaimer

⚠️ For educational purposes only. This chatbot is not a substitute for professional medical advice, diagnosis, or treatment. Always consult a qualified healthcare provider for medical concerns.

## Links

- 🔗 GitHub Repository
- 📹 Demo Video
- 🤗 Hugging Face Model

## License

MIT License - see the LICENSE file for details.

Questions or Issues? Open an issue on the GitHub repository or reach out via the Community tab.
Created as a portfolio project demonstrating LLM fine-tuning, quantization, and deployment techniques.