---
title: MEDChat AI
emoji: 🏥
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
pinned: false
license: mit
---

# MEDChat AI 🏥

A medical chatbot powered by fine-tuned LLaMA 2 for answering medical questions.

## ⚠️ GPU Hardware Required

This Space uses a **4-bit quantized LLaMA 2 model** that requires GPU hardware to run inference.

### How to Test This Application

#### Option 1: Upgrade This Space to GPU (Paid)

1. Click **Settings** in the top navigation
2. Select **Space hardware**
3. Choose **T4 GPU** (~$0.60/hour when running)
4. Click **Save** and wait for the Space to restart

#### Option 2: Run on Google Colab (Free) ⭐ Recommended

1. Visit the [GitHub Repository](https://github.com/BirukZenebe1/Fine-tunned-Llama-V2)
2. Click the Colab badge or download the notebook
3. Open it in Google Colab
4. Select **Runtime** → **Change runtime type** → **T4 GPU**
5. Run all cells to test the chatbot on a free GPU

#### Option 3: Watch the Demo Video

- [View working demo](your-video-link-here)

**Current Status:** This Space is running on **CPU** and will display error messages when attempting to generate responses. The interface itself is fully functional and can be explored.

---

## Features

- 💬 Medical Q&A using fine-tuned LLaMA 2
- 🔐 User authentication system (demo: in-memory storage)
- 🎨 Clean, intuitive Gradio interface
- 📚 Fine-tuned on a medical terminology dataset
- ⚡ 4-bit quantization for efficient inference

## Usage

1. **Sign Up**: Create an account on the Sign Up tab
2. **Login**: Use your credentials to log in
3. **Chat**: Ask medical questions and get AI-powered responses

## Example Questions

- What does the immune system do?
- What is Epistaxis?
- What are allergies?
- What's the difference between bacteria and viruses?
- Should I start taking creatine?
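The demo authentication in the features list could be sketched as follows. This is a minimal illustration of in-memory credential storage, not the actual code from `app.py`; the names `users`, `sign_up`, and `login` are hypothetical.

```python
import hashlib

# Demo-only in-memory user store; all accounts vanish when the Space restarts.
users = {}

def sign_up(username: str, password: str) -> bool:
    """Register a new user; returns False if the username is taken."""
    if username in users:
        return False
    # Store a hash rather than the plaintext password, even in a demo.
    users[username] = hashlib.sha256(password.encode()).hexdigest()
    return True

def login(username: str, password: str) -> bool:
    """Check submitted credentials against the in-memory store."""
    hashed = hashlib.sha256(password.encode()).hexdigest()
    return users.get(username) == hashed
```

A real deployment would use a persistent database and a salted password-hashing scheme; plain SHA-256 over an in-memory dict is only acceptable for a throwaway demo.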
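As a back-of-the-envelope check on why 4-bit quantization matters here: weight memory alone for a 7B-parameter model drops from roughly 14 GB in float16 to roughly 3.5 GB at 4 bits per weight, which is why the quantized model fits comfortably on a 16 GB T4 (activations, KV cache, and framework overhead are extra, and these numbers are rough).

```python
params = 7_000_000_000  # LLaMA 2 7B parameter count

fp16_gb = params * 2 / 1e9   # float16: 2 bytes per weight
nf4_gb = params * 0.5 / 1e9  # NF4: 4 bits (0.5 bytes) per weight

print(f"float16 weights: ~{fp16_gb:.1f} GB")   # ~14.0 GB
print(f"4-bit NF4 weights: ~{nf4_gb:.1f} GB")  # ~3.5 GB
```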
## Technical Details

### Model Architecture

- **Base Model**: LLaMA 2 (7B parameters)
- **Fine-tuning**: LoRA (Low-Rank Adaptation)
- **Quantization**: 4-bit with bitsandbytes (NF4)
- **Dataset**: Medical terminology corpus

### Tech Stack

- **Framework**: Gradio (version pinned by `sdk_version` above)
- **Model Hub**: Hugging Face Transformers
- **Fine-tuning**: PEFT (Parameter-Efficient Fine-Tuning)
- **Quantization**: bitsandbytes
- **Training**: SFTTrainer from the TRL library

### Model Configuration

```python
from transformers import BitsAndBytesConfig
from peft import LoraConfig
import torch

bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_quant_type="nf4",
                                bnb_4bit_compute_dtype=torch.float16)
lora_config = LoraConfig(r=16, lora_alpha=16)
```

## Repository Structure

```
├── app.py              # Main Gradio application
├── requirements.txt    # Python dependencies
├── README.md           # This file
└── notebooks/          # Google Colab notebooks
    └── training.ipynb  # Model fine-tuning notebook
```

## Disclaimer

⚠️ **For educational purposes only.** This chatbot is not a substitute for professional medical advice, diagnosis, or treatment. Always consult a qualified healthcare provider for medical concerns.

## Links

- 🔗 [GitHub Repository](https://github.com/yourusername/medchat-ai)
- 📹 [Demo Video](your-video-link-here)
- 📚 [Hugging Face Model](https://huggingface.co/aboonaji/llama2finetune-v2)

## License

MIT License. See the LICENSE file for details.

---

**Questions or Issues?** Open an issue on the [GitHub repository](https://github.com/yourusername/medchat-ai) or reach out via the Community tab.

Created as a portfolio project demonstrating LLM fine-tuning, quantization, and deployment techniques.