---
title: Medical Image Segmentation - GI Tract
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "4.11.0"
python_version: "3.9"
app_file: app.py
pinned: false
---

# 🏥 Medical Image Segmentation - UW-Madison GI Tract

![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Python](https://img.shields.io/badge/python-3.8+-green.svg)
![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)
![Status](https://img.shields.io/badge/status-production--ready-success.svg)

> Automated semantic segmentation of gastrointestinal tract organs in medical CT/MRI images, powered by SegFormer with a Gradio web interface.

## 📋 Table of Contents

- [Overview](#overview)
- [Features](#features)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [Model Details](#model-details)
- [Training](#training)
- [API Reference](#api-reference)
- [Contributing](#contributing)
- [License](#license)

## 📊 Overview

This project provides an end-to-end solution for segmenting GI tract organs in medical images:

- **Stomach**
- **Large Bowel**
- **Small Bowel**

Built on the state-of-the-art SegFormer architecture and fine-tuned on the UW-Madison GI Tract Image Segmentation dataset (45K+ images).
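The three organ classes above, plus a background class, map naturally to integer labels in the segmentation mask. The sketch below shows one way to turn a predicted class-index mask into a colored overlay; the exact class indices and colors are illustrative assumptions, not necessarily the mapping used in `app.py`:

```python
import numpy as np

# Hypothetical label map: 0 = background, then the three GI tract organs.
LABEL_MAP = {
    0: "background",
    1: "large_bowel",
    2: "small_bowel",
    3: "stomach",
}

# Example overlay colors (RGB), one per organ class.
CLASS_COLORS = {
    1: (255, 0, 0),    # large bowel -> red
    2: (0, 255, 0),    # small bowel -> green
    3: (0, 0, 255),    # stomach -> blue
}

def colorize_mask(mask: np.ndarray) -> np.ndarray:
    """Turn an (H, W) class-index mask into an (H, W, 3) RGB overlay.

    Background pixels stay black; each organ class gets its color.
    """
    overlay = np.zeros((*mask.shape, 3), dtype=np.uint8)
    for cls, color in CLASS_COLORS.items():
        overlay[mask == cls] = color
    return overlay
```

An overlay like this can then be alpha-blended onto the input image for the web interface's color-coded output.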
### Key Achievements

- ✅ Efficient 64M-parameter model
- ✅ Interactive Gradio web interface
- ✅ Real-time inference on CPU/GPU
- ✅ 40+ pre-loaded sample images
- ✅ Complete training pipeline included
- ✅ Production-ready code

## ✨ Features

### Core Capabilities

- **Web Interface**: Upload images and get instant segmentation predictions
- **Batch Processing**: Test on multiple images simultaneously
- **Color-Coded Output**: Intuitive visual representation of organ locations
- **Confidence Scores**: Pixel-level confidence metrics for each organ
- **Interactive Notebook**: Educational Jupyter notebook with step-by-step examples

### Development Tools

- Automated data download (Kaggle integration)
- Dataset preparation and preprocessing
- Model training with validation
- Comprehensive evaluation metrics
- Diagnostic system checker
- Simple testing without ground truth

## 🚀 Installation

### Requirements

- Python 3.8 or higher
- CUDA 11.8+ (optional, for GPU acceleration)
- 4GB RAM minimum (8GB recommended)
- 2GB disk space

### Step 1: Clone Repository

```bash
git clone https://github.com/hung2903/medical-image-segmentation.git
cd medical-image-segmentation
```

### Step 2: Create Virtual Environment

```bash
# Using venv
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Or using conda
conda create -n medseg python=3.10
conda activate medseg
```

### Step 3: Install Dependencies

```bash
pip install -r requirements.txt
```

### Step 4: Verify Installation

```bash
python diagnose.py
```

All checks should show ✅ PASSED.

## 🎯 Quick Start

### 1. Run Web Interface (Easiest)

```bash
python app.py
```

Then open http://127.0.0.1:7860 in your browser.

### 2. Test on Sample Images

```bash
python test_simple.py \
    --model segformer_trained_weights \
    --images samples \
    --output-dir results
```

### 3. Interactive Jupyter Notebook

```bash
jupyter notebook demo.ipynb
```

## 📖 Usage

### Web Interface

1. Launch: `python app.py`
2.
Upload a medical image (PNG/JPG)
3. Click "Generate Predictions"
4. View the color-coded segmentation with confidence scores
5. Download the result image

**Supported Formats**: PNG, JPG, JPEG, GIF, BMP, WEBP

### Python Script

```python
from app import get_model, predict
import torch
from PIL import Image

# Load model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = get_model(device)

# Load image
image = Image.open('sample.png')

# Get predictions
output_image, confidence_info = predict(image)
```

### Python API

```python
import torch
from PIL import Image
from transformers import SegformerForSemanticSegmentation, SegformerImageProcessor

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = SegformerForSemanticSegmentation.from_pretrained(
    'segformer_trained_weights'
).to(device)
processor = SegformerImageProcessor()

# Process image
image = Image.open('sample.png')
image_input = processor(image, return_tensors='pt').to(device)
outputs = model(**image_input)
logits = outputs.logits
```

## 📁 Project Structure

```
.
├── app.py                      # Gradio web interface
├── train.py                    # Model training script
├── test.py                     # Comprehensive evaluation
├── test_simple.py              # Simple testing without ground truth
├── download_dataset.py         # Kaggle dataset download
├── prepare_dataset.py          # Data preprocessing
├── diagnose.py                 # System diagnostics
├── demo.ipynb                  # Interactive notebook
├── requirements.txt            # Python dependencies
├── LICENSE                     # MIT License
├── README.md                   # This file
├── TRAINING_GUIDE.md           # Detailed training instructions
├── IMPLEMENTATION_SUMMARY.md   # Technical details
├── FILE_INDEX.md               # File navigation guide
├── samples/                    # 40 pre-loaded sample images
├── segformer_trained_weights/  # Pre-trained model
│   ├── config.json
│   └── pytorch_model.bin
└── test_results_simple/        # Test outputs
```

## 🧠 Model Details

### Architecture

- **Model**: SegFormer-B0
- **Framework**: Hugging Face Transformers
- **Pre-training**: Cityscapes dataset
- **Fine-tuning**: UW-Madison GI Tract dataset

### Specifications

| Aspect | Value |
|--------|-------|
| Input Size | 288 × 288 pixels |
| Output Classes | 4 (background + 3 organs) |
| Parameters | 64M |
| Model Size | 256 MB |
| Inference Time | ~500 ms (CPU), ~100 ms (GPU) |

### Normalization

```
Mean: [0.485, 0.456, 0.406]
Std:  [0.229, 0.224, 0.225]
```

(ImageNet standard)

## 🎓 Training

### Download Full Dataset

```bash
# Requires Kaggle API key setup
python download_dataset.py
```

### Prepare Data

```bash
python prepare_dataset.py \
    --data-dir /path/to/downloaded/data \
    --output-dir prepared_data
```

### Train Model

```bash
python train.py \
    --epochs 20 \
    --batch-size 16 \
    --learning-rate 1e-4 \
    --train-dir prepared_data/train_images \
    --val-dir prepared_data/val_images
```

### Evaluate

```bash
python test.py \
    --model models/best_model \
    --test-images prepared_data/test_images \
    --test-masks prepared_data/test_masks \
    --visualize
```

See [TRAINING_GUIDE.md](TRAINING_GUIDE.md) for detailed instructions.
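Evaluation for segmentation tasks like this usually rests on overlap metrics such as Dice and IoU. The sketch below is a minimal, self-contained NumPy version of a per-class Dice score for integer-labeled masks; it is a hypothetical helper for illustration, not the actual implementation in `test.py`:

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray,
               num_classes: int = 4, eps: float = 1e-7) -> dict:
    """Per-class Dice coefficient for (H, W) integer-labeled masks.

    Class 0 is treated as background and skipped. The eps term keeps
    the score well-defined when a class is absent from both masks.
    """
    scores = {}
    for cls in range(1, num_classes):
        p = pred == cls
        t = target == cls
        intersection = np.logical_and(p, t).sum()
        scores[cls] = (2.0 * intersection + eps) / (p.sum() + t.sum() + eps)
    return scores
```

IoU follows the same pattern with `intersection / (p.sum() + t.sum() - intersection)`; averaging per-class IoU over the organ classes gives the mIoU reported in the metrics section below.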
## 📡 API Reference

### app.py

```python
def predict(image: Image.Image) -> Tuple[Image.Image, str]:
    """Perform segmentation on input image."""

def get_model(device: torch.device) -> SegformerForSemanticSegmentation:
    """Load pre-trained model."""
```

### test_simple.py

```python
class SimpleSegmentationTester:
    def test_batch(self, image_paths: List[str]) -> Dict:
        """Segment multiple images."""
```

### train.py

```python
class MedicalImageSegmentationTrainer:
    def train(self, num_epochs: int) -> None:
        """Train model with validation."""
```

## 🔄 Preprocessing Pipeline

1. **Image Resize**: 288 × 288
2. **Normalization**: ImageNet standard (mean/std)
3. **Tensor Conversion**: Convert to PyTorch tensors
4. **Device Transfer**: Move to GPU/CPU

## 📊 Output Format

### Web Interface

- Colored overlay image (red/green/blue for organs)
- Confidence percentages per organ
- Downloadable result image

### JSON Output (test_simple.py)

```json
{
  "case101_day26": {
    "large_bowel_pixels": 244,
    "small_bowel_pixels": 1901,
    "stomach_pixels": 2979,
    "total_segmented": 5124
  }
}
```

## 🐛 Troubleshooting

### ModuleNotFoundError

```bash
pip install -r requirements.txt --default-timeout=1000
```

### CUDA Out of Memory

```python
# Use CPU instead
device = torch.device('cpu')

# Or reduce the batch size
batch_size = 4
```

### Model Loading Issues

```bash
python diagnose.py  # Check all requirements
```

## 📈 Performance Metrics

Evaluated on the validation set:

- **mIoU**: mean Intersection over Union
- **Precision**: per-class accuracy
- **Recall**: organ detection rate
- **F1-Score**: harmonic mean of precision and recall

See [IMPLEMENTATION_SUMMARY.md](IMPLEMENTATION_SUMMARY.md) for details.

## 🤝 Contributing

Contributions welcome!
Areas for improvement:

- [ ] Add more organ classes
- [ ] Improve inference speed
- [ ] Add DICOM format support
- [ ] Deploy to Hugging Face Spaces
- [ ] Add multi-modal support (CT/MRI)

## 📚 References

- [UW-Madison GI Tract Dataset](https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation)
- [SegFormer Paper](https://arxiv.org/abs/2105.15203)
- [Hugging Face Transformers](https://huggingface.co/docs/transformers)

## 📝 License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.

## 👥 Citation

If you use this project, please cite:

```bibtex
@software{medical_image_seg_2026,
  title={Medical Image Segmentation - UW-Madison GI Tract},
  author={Hungkm},
  year={2026},
  url={https://github.com/hung2903/medical-image-segmentation}
}
```

## 📧 Contact

For questions or issues:

- Open a GitHub issue
- Email: kmh2903.dsh@gmail.com

---

**Made with ❤️ for medical imaging**