# UroScribe — Fine-tuned Medical Vision-Language Model

A fine-tuned version of [MedGemma-4B](https://huggingface.co/google/medgemma-4b) specialized for automated urology ultrasound report generation.

## Model Description

UroScribe takes a urology ultrasound image as input and generates a structured, clinically formatted radiology report. It was fine-tuned to bridge the gap between raw imaging and the time-consuming process of manual report writing in urology practice.
## Base Model

- **Architecture:** MedGemma-4B (vision-language)
- **Fine-tuning method:** QLoRA (4-bit quantization + LoRA adapters)
- **Training hardware:** NVIDIA H200
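The QLoRA setup above (a 4-bit quantized base model with trainable LoRA adapters) can be sketched with `transformers` and `peft` configs. This is a minimal illustration only: the rank, alpha, dropout, and `target_modules` values below are assumptions, not the exact training recipe used for UroScribe.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization for the frozen base weights (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapters trained on top of the quantized base.
# r, lora_alpha, lora_dropout, and target_modules are illustrative values.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

Both configs would typically be passed when loading the base model (`quantization_config=bnb_config`) and wrapping it with `peft.get_peft_model(model, lora_config)` before training.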
## Intended Use

- Automated structured report generation from urology ultrasound images
- Clinical decision support tooling
- Research into medical vision-language models for subspecialty radiology
## Input / Output

**Input:** Urology ultrasound image (PNG)

**Output:** Structured radiology report including findings, impressions, and recommendations
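The output structure described above can be illustrated with a minimal sketch. The section names (findings, impressions, recommendations) come from this card; the exact headings and formatting the model emits are assumptions, and the example strings are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UltrasoundReport:
    """Illustrative schema for a structured urology ultrasound report."""
    findings: List[str]
    impressions: List[str]
    recommendations: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the report as titled sections, skipping empty ones."""
        sections = [
            ("FINDINGS", self.findings),
            ("IMPRESSIONS", self.impressions),
            ("RECOMMENDATIONS", self.recommendations),
        ]
        lines = []
        for title, items in sections:
            if not items:
                continue
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


# Placeholder content, not real model output.
report = UltrasoundReport(
    findings=["Example finding sentence."],
    impressions=["Example impression sentence."],
)
print(report.render())
```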
## Limitations

This model is intended for research and demonstration purposes only. It is not FDA-cleared and should not be used for clinical decision-making without physician oversight.
## Developed By
| Built during Hacklytics 2026 by Benito Karkada & Hadi Malik. |