---
title: Medical Image Segmentation - GI Tract
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "4.11.0"
python_version: "3.9"
app_file: app.py
pinned: false
---

# 🏥 Medical Image Segmentation - UW-Madison GI Tract

![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Python](https://img.shields.io/badge/python-3.8+-green.svg)
![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)
![Status](https://img.shields.io/badge/status-production--ready-success.svg)

> Automated semantic segmentation of gastrointestinal tract organs in medical CT/MRI images, powered by SegFormer with a Gradio web interface.

## 📋 Table of Contents

- [Overview](#overview)
- [Features](#features)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [Model Details](#model-details)
- [Training](#training)
- [API Reference](#api-reference)
- [Contributing](#contributing)
- [License](#license)

## 📊 Overview

This project provides an end-to-end solution for segmenting GI tract organs in medical images:

- **Stomach**
- **Large Bowel**
- **Small Bowel**

Built on the state-of-the-art SegFormer architecture and fine-tuned on the UW-Madison GI Tract Image Segmentation dataset (45K+ images).
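The three organ classes above, plus a background class, map naturally to integer labels in the segmentation mask. The sketch below shows one way to turn a predicted class-index mask into a colored overlay; the exact class indices and colors are illustrative assumptions, not necessarily the mapping used in `app.py`:

```python
import numpy as np

# Hypothetical label map: 0 = background, then the three GI tract organs.
LABEL_MAP = {
    0: "background",
    1: "large_bowel",
    2: "small_bowel",
    3: "stomach",
}

# Example overlay colors (RGB), one per organ class.
CLASS_COLORS = {
    1: (255, 0, 0),    # large bowel -> red
    2: (0, 255, 0),    # small bowel -> green
    3: (0, 0, 255),    # stomach -> blue
}

def colorize_mask(mask: np.ndarray) -> np.ndarray:
    """Turn an (H, W) class-index mask into an (H, W, 3) RGB overlay.

    Background pixels stay black; each organ class gets its color.
    """
    overlay = np.zeros((*mask.shape, 3), dtype=np.uint8)
    for cls, color in CLASS_COLORS.items():
        overlay[mask == cls] = color
    return overlay
```

An overlay like this can then be alpha-blended onto the input image for the web interface's color-coded output.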
### Key Achievements

- ✅ Efficient 64M-parameter model
- ✅ Interactive Gradio web interface
- ✅ Real-time inference on CPU/GPU
- ✅ 40+ pre-loaded sample images
- ✅ Complete training pipeline included
- ✅ Production-ready code

## ✨ Features

### Core Capabilities

- **Web Interface**: Upload images and get instant segmentation predictions
- **Batch Processing**: Test on multiple images simultaneously
- **Color-Coded Output**: Intuitive visual representation of organ locations
- **Confidence Scores**: Pixel-level confidence metrics for each organ
- **Interactive Notebook**: Educational Jupyter notebook with step-by-step examples

### Development Tools

- Automated data download (Kaggle integration)
- Dataset preparation and preprocessing
- Model training with validation
- Comprehensive evaluation metrics
- Diagnostic system checker
- Simple testing without ground truth

## 🚀 Installation

### Requirements

- Python 3.8 or higher
- CUDA 11.8+ (optional, for GPU acceleration)
- 4GB RAM minimum (8GB recommended)
- 2GB disk space

### Step 1: Clone Repository

```bash
git clone https://github.com/hung2903/medical-image-segmentation.git
cd medical-image-segmentation
```

### Step 2: Create Virtual Environment

```bash
# Using venv
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Or using conda
conda create -n medseg python=3.10
conda activate medseg
```

### Step 3: Install Dependencies

```bash
pip install -r requirements.txt
```

### Step 4: Verify Installation

```bash
python diagnose.py
```

All checks should show ✅ PASSED.

## 🎯 Quick Start

### 1. Run Web Interface (Easiest)

```bash
python app.py
```

Then open http://127.0.0.1:7860 in your browser.

### 2. Test on Sample Images

```bash
python test_simple.py \
    --model segformer_trained_weights \
    --images samples \
    --output-dir results
```

### 3. Interactive Jupyter Notebook

```bash
jupyter notebook demo.ipynb
```

## 📖 Usage

### Web Interface

1. Launch: `python app.py`
2.
Upload a medical image (PNG/JPG)
3. Click "Generate Predictions"
4. View the color-coded segmentation with confidence scores
5. Download the result image

**Supported Formats**: PNG, JPG, JPEG, GIF, BMP, WEBP

### Python Script

```python
from app import get_model, predict
import torch
from PIL import Image

# Load model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = get_model(device)

# Load image
image = Image.open('sample.png')

# Get predictions
output_image, confidence_info = predict(image)
```

### Python API

```python
import torch
from PIL import Image
from transformers import SegformerForSemanticSegmentation, SegformerImageProcessor

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = SegformerForSemanticSegmentation.from_pretrained(
    'segformer_trained_weights'
).to(device)
processor = SegformerImageProcessor()

# Process image
image = Image.open('sample.png')
image_input = processor(image, return_tensors='pt').to(device)
outputs = model(**image_input)
logits = outputs.logits
```

## 📁 Project Structure

```
.
├── app.py                      # Gradio web interface
├── train.py                    # Model training script
├── test.py                     # Comprehensive evaluation
├── test_simple.py              # Simple testing without ground truth
├── download_dataset.py         # Kaggle dataset download
├── prepare_dataset.py          # Data preprocessing
├── diagnose.py                 # System diagnostics
├── demo.ipynb                  # Interactive notebook
├── requirements.txt            # Python dependencies
├── LICENSE                     # MIT License
├── README.md                   # This file
├── TRAINING_GUIDE.md           # Detailed training instructions
├── IMPLEMENTATION_SUMMARY.md   # Technical details
├── FILE_INDEX.md               # File navigation guide
├── samples/                    # 40 pre-loaded sample images
├── segformer_trained_weights/  # Pre-trained model
│   ├── config.json
│   └── pytorch_model.bin
└── test_results_simple/        # Test outputs
```

## 🧠 Model Details

### Architecture

- **Model**: SegFormer-B0
- **Framework**: Hugging Face Transformers
- **Pre-training**: Cityscapes dataset
- **Fine-tuning**: UW-Madison GI Tract dataset

### Specifications

| Aspect | Value |
|--------|-------|
| Input Size | 288 × 288 pixels |
| Output Classes | 4 (background + 3 organs) |
| Parameters | 64M |
| Model Size | 256 MB |
| Inference Time | ~500 ms (CPU), ~100 ms (GPU) |

### Normalization

```
Mean: [0.485, 0.456, 0.406]
Std:  [0.229, 0.224, 0.225]
```

(ImageNet standard)

## 🎓 Training

### Download Full Dataset

```bash
# Requires Kaggle API key setup
python download_dataset.py
```

### Prepare Data

```bash
python prepare_dataset.py \
    --data-dir /path/to/downloaded/data \
    --output-dir prepared_data
```

### Train Model

```bash
python train.py \
    --epochs 20 \
    --batch-size 16 \
    --learning-rate 1e-4 \
    --train-dir prepared_data/train_images \
    --val-dir prepared_data/val_images
```

### Evaluate

```bash
python test.py \
    --model models/best_model \
    --test-images prepared_data/test_images \
    --test-masks prepared_data/test_masks \
    --visualize
```

See [TRAINING_GUIDE.md](TRAINING_GUIDE.md) for detailed instructions.
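Evaluation for segmentation tasks like this usually rests on overlap metrics such as Dice and IoU. The sketch below is a minimal, self-contained NumPy version of a per-class Dice score for integer-labeled masks; it is a hypothetical helper for illustration, not the actual implementation in `test.py`:

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray,
               num_classes: int = 4, eps: float = 1e-7) -> dict:
    """Per-class Dice coefficient for (H, W) integer-labeled masks.

    Class 0 is treated as background and skipped. The eps term keeps
    the score well-defined when a class is absent from both masks.
    """
    scores = {}
    for cls in range(1, num_classes):
        p = pred == cls
        t = target == cls
        intersection = np.logical_and(p, t).sum()
        scores[cls] = (2.0 * intersection + eps) / (p.sum() + t.sum() + eps)
    return scores
```

IoU follows the same pattern with `intersection / (p.sum() + t.sum() - intersection)`; averaging per-class IoU over the organ classes gives the mIoU reported in the metrics section below.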
## 📡 API Reference

### app.py

```python
def predict(image: Image.Image) -> Tuple[Image.Image, str]:
    """Perform segmentation on input image."""

def get_model(device: torch.device) -> SegformerForSemanticSegmentation:
    """Load pre-trained model."""
```

### test_simple.py

```python
class SimpleSegmentationTester:
    def test_batch(self, image_paths: List[str]) -> Dict:
        """Segment multiple images."""
```

### train.py

```python
class MedicalImageSegmentationTrainer:
    def train(self, num_epochs: int) -> None:
        """Train model with validation."""
```

## 🔄 Preprocessing Pipeline

1. **Image Resize**: 288 × 288
2. **Normalization**: ImageNet standard (mean/std)
3. **Tensor Conversion**: Convert to PyTorch tensors
4. **Device Transfer**: Move to GPU/CPU

## 📊 Output Format

### Web Interface

- Colored overlay image (red/green/blue for organs)
- Confidence percentages per organ
- Downloadable result image

### JSON Output (test_simple.py)

```json
{
  "case101_day26": {
    "large_bowel_pixels": 244,
    "small_bowel_pixels": 1901,
    "stomach_pixels": 2979,
    "total_segmented": 5124
  }
}
```

## 🐛 Troubleshooting

### ModuleNotFoundError

```bash
pip install -r requirements.txt --default-timeout=1000
```

### CUDA Out of Memory

```python
# Use CPU instead
device = torch.device('cpu')

# Or reduce the batch size
batch_size = 4
```

### Model Loading Issues

```bash
python diagnose.py  # Check all requirements
```

## 📈 Performance Metrics

Evaluated on the validation set:

- **mIoU**: mean Intersection over Union
- **Precision**: per-class accuracy
- **Recall**: organ detection rate
- **F1-Score**: harmonic mean of precision and recall

See [IMPLEMENTATION_SUMMARY.md](IMPLEMENTATION_SUMMARY.md) for details.

## 🤝 Contributing

Contributions welcome!
Areas for improvement:

- [ ] Add more organ classes
- [ ] Improve inference speed
- [ ] Add DICOM format support
- [ ] Deploy to Hugging Face Spaces
- [ ] Add multi-modal support (CT/MRI)

## 📚 References

- [UW-Madison GI Tract Dataset](https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation)
- [SegFormer Paper](https://arxiv.org/abs/2105.15203)
- [Hugging Face Transformers](https://huggingface.co/docs/transformers)

## 📝 License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.

## 👥 Citation

If you use this project, please cite:

```bibtex
@software{medical_image_seg_2026,
  title={Medical Image Segmentation - UW-Madison GI Tract},
  author={Hungkm},
  year={2026},
  url={https://github.com/hung2903/medical-image-segmentation}
}
```

## 📧 Contact

For questions or issues:

- Open a GitHub issue
- Email: kmh2903.dsh@gmail.com

---

**Made with ❤️ for medical imaging**