---
title: MEDChat AI
emoji: 🏥
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
pinned: false
license: mit
---

# MEDChat AI 🏥

A medical chatbot powered by fine-tuned LLaMA 2 for answering medical questions.

## ⚠️ GPU Hardware Required

This Space uses a **4-bit quantized LLaMA 2 model** that requires GPU hardware to run inference.

### How to Test This Application

#### Option 1: Upgrade This Space to GPU (Paid)

1. Click **Settings** in the top navigation
2. Select **Space hardware**
3. Choose **T4 GPU** (~$0.60/hour when running)
4. Click **Save** and wait for the Space to restart

#### Option 2: Run on Google Colab (Free) ⭐ Recommended

1. Visit the [GitHub Repository](https://github.com/BirukZenebe1/Fine-tunned-Llama-V2)
2. Click the Colab badge or download the notebook
3. Open it in Google Colab
4. Select **Runtime** → **Change runtime type** → **T4 GPU**
5. Run all cells to test the chatbot on a free GPU

#### Option 3: Watch the Demo Video

- [View working demo](your-video-link-here)

**Current Status:** This Space is running on **CPU** and will display error messages when attempting to generate responses. The interface itself is fully functional and can be explored.

---

## Features

- 💬 Medical Q&A using fine-tuned LLaMA 2
- 🔐 User authentication system (demo: in-memory storage)
- 🎨 Clean, intuitive Gradio interface
- 📚 Fine-tuned on a medical terminology dataset
- ⚡ 4-bit quantization for efficient inference

## Usage

1. **Sign Up**: Create an account on the Sign Up tab
2. **Login**: Use your credentials to log in
3. **Chat**: Ask medical questions and get AI-powered responses

## Example Questions

- What does the immune system do?
- What is Epistaxis?
- What are allergies?
- What's the difference between bacteria and viruses?
- Should I start taking creatine?
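The demo authentication in the features list could be sketched as follows. This is a minimal illustration of in-memory credential storage, not the actual code from `app.py`; the names `users`, `sign_up`, and `login` are hypothetical.

```python
import hashlib

# Demo-only in-memory user store; all accounts vanish when the Space restarts.
users = {}

def sign_up(username: str, password: str) -> bool:
    """Register a new user; returns False if the username is taken."""
    if username in users:
        return False
    # Store a hash rather than the plaintext password, even in a demo.
    users[username] = hashlib.sha256(password.encode()).hexdigest()
    return True

def login(username: str, password: str) -> bool:
    """Check submitted credentials against the in-memory store."""
    hashed = hashlib.sha256(password.encode()).hexdigest()
    return users.get(username) == hashed
```

A real deployment would use a persistent database and a salted password-hashing scheme; plain SHA-256 over an in-memory dict is only acceptable for a throwaway demo.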
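As a back-of-the-envelope check on why 4-bit quantization matters here: weight memory alone for a 7B-parameter model drops from roughly 14 GB in float16 to roughly 3.5 GB at 4 bits per weight, which is why the quantized model fits comfortably on a 16 GB T4 (activations, KV cache, and framework overhead are extra, and these numbers are rough).

```python
params = 7_000_000_000  # LLaMA 2 7B parameter count

fp16_gb = params * 2 / 1e9   # float16: 2 bytes per weight
nf4_gb = params * 0.5 / 1e9  # NF4: 4 bits (0.5 bytes) per weight

print(f"float16 weights: ~{fp16_gb:.1f} GB")   # ~14.0 GB
print(f"4-bit NF4 weights: ~{nf4_gb:.1f} GB")  # ~3.5 GB
```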
## Technical Details

### Model Architecture

- **Base Model**: LLaMA 2 (7B parameters)
- **Fine-tuning**: LoRA (Low-Rank Adaptation)
- **Quantization**: 4-bit with bitsandbytes (NF4)
- **Dataset**: Medical terminology corpus

### Tech Stack

- **Framework**: Gradio (version pinned by `sdk_version` above)
- **Model Hub**: Hugging Face Transformers
- **Fine-tuning**: PEFT (Parameter-Efficient Fine-Tuning)
- **Quantization**: bitsandbytes
- **Training**: SFTTrainer from the TRL library

### Model Configuration

```python
from transformers import BitsAndBytesConfig
from peft import LoraConfig
import torch

bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_quant_type="nf4",
                                bnb_4bit_compute_dtype=torch.float16)
lora_config = LoraConfig(r=16, lora_alpha=16)
```

## Repository Structure

```
├── app.py              # Main Gradio application
├── requirements.txt    # Python dependencies
├── README.md           # This file
└── notebooks/          # Google Colab notebooks
    └── training.ipynb  # Model fine-tuning notebook
```

## Disclaimer

⚠️ **For educational purposes only.** This chatbot is not a substitute for professional medical advice, diagnosis, or treatment. Always consult a qualified healthcare provider for medical concerns.

## Links

- 🔗 [GitHub Repository](https://github.com/yourusername/medchat-ai)
- 📹 [Demo Video](your-video-link-here)
- 📚 [Hugging Face Model](https://huggingface.co/aboonaji/llama2finetune-v2)

## License

MIT License. See the LICENSE file for details.

---

**Questions or Issues?** Open an issue on the [GitHub repository](https://github.com/yourusername/medchat-ai) or reach out via the Community tab.

Created as a portfolio project demonstrating LLM fine-tuning, quantization, and deployment techniques.