---
title: Medical Image Segmentation - GI Tract
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "4.11.0"
python_version: "3.9"
app_file: app.py
pinned: false
---

# 🏥 Medical Image Segmentation - UW-Madison GI Tract

![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Python](https://img.shields.io/badge/python-3.8+-green.svg)
![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)
![Status](https://img.shields.io/badge/status-production--ready-success.svg)

> Automated semantic segmentation of gastrointestinal tract organs in medical CT/MRI images, using a SegFormer model served through a Gradio web interface.

## 📋 Table of Contents
- [Overview](#overview)
- [Features](#features)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [Model Details](#model-details)
- [Training](#training)
- [API Reference](#api-reference)
- [Contributing](#contributing)
- [License](#license)

## 📊 Overview

This project provides an end-to-end solution for segmenting GI tract organs in medical images:
- **Stomach**
- **Large Bowel**
- **Small Bowel**

Built with state-of-the-art SegFormer architecture and trained on the UW-Madison GI Tract Image Segmentation dataset (45K+ images).

### Key Achievements
- ✅ 64M parameter efficient model
- ✅ Interactive Gradio web interface  
- ✅ Real-time inference on CPU/GPU
- ✅ 40+ pre-loaded sample images
- ✅ Complete training pipeline included
- ✅ Production-ready code

## ✨ Features

### Core Capabilities
- **Web Interface**: Upload images and get instant segmentation predictions
- **Batch Processing**: Test on multiple images simultaneously
- **Color-Coded Output**: Intuitive visual representation of organ locations
- **Confidence Scores**: Pixel-level confidence metrics for each organ
- **Interactive Notebook**: Educational Jupyter notebook with step-by-step examples

### Development Tools
- Data download automation (Kaggle integration)
- Dataset preparation and preprocessing
- Model training with validation
- Comprehensive evaluation metrics
- Diagnostic system checker
- Simple testing without ground truth

## 🚀 Installation

### Requirements
- Python 3.8 or higher
- CUDA 11.8+ (optional, for GPU acceleration)
- 4GB RAM minimum (8GB recommended)
- 2GB disk space

### Step 1: Clone Repository
```bash
git clone https://github.com/hung2903/medical-image-segmentation.git
cd medical-image-segmentation
```

### Step 2: Create Virtual Environment
```bash
# Using venv
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Or using conda
conda create -n medseg python=3.10
conda activate medseg
```

### Step 3: Install Dependencies
```bash
pip install -r requirements.txt
```

### Step 4: Verify Installation
```bash
python diagnose.py
```

All checks should show ✅ PASSED.

## 🎯 Quick Start

### 1. Run Web Interface (Easiest)
```bash
python app.py
```
Then open http://127.0.0.1:7860 in your browser.

### 2. Test on Sample Images
```bash
python test_simple.py \
    --model segformer_trained_weights \
    --images samples \
    --output-dir results
```

### 3. Interactive Jupyter Notebook
```bash
jupyter notebook demo.ipynb
```

## 📖 Usage

### Web Interface
1. Launch: `python app.py`
2. Upload medical image (PNG/JPG)
3. Click "Generate Predictions"
4. View color-coded segmentation with confidence scores
5. Download result image

**Supported Formats**: PNG, JPG, JPEG, GIF, BMP, WEBP

### Python (via `app.py`)
```python
from app import get_model, predict
import torch
from PIL import Image

# Load model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = get_model(device)

# Load image
image = Image.open('sample.png')

# Get predictions
output_image, confidence_info = predict(image)
```

### Python API
```python
import torch
from PIL import Image
from transformers import SegformerForSemanticSegmentation, SegformerImageProcessor

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = SegformerForSemanticSegmentation.from_pretrained(
    'segformer_trained_weights'
).to(device)
processor = SegformerImageProcessor()

# Load and process image
image = Image.open('sample.png').convert('RGB')
image_input = processor(image, return_tensors='pt').to(device)
outputs = model(**image_input)
logits = outputs.logits
```
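SegFormer emits logits at a quarter of the input resolution, so a typical post-processing step upsamples them back to image size and takes a per-pixel argmax. A minimal sketch (the `logits_to_mask` helper is illustrative, not part of the repository):

```python
import torch
import torch.nn.functional as F

def logits_to_mask(logits: torch.Tensor, size: tuple) -> torch.Tensor:
    """Upsample logits to the original image size and take the per-pixel argmax.

    logits: (batch, num_classes, h, w) tensor from the model.
    size:   (height, width) of the original image.
    Returns a (batch, height, width) tensor of class indices.
    """
    upsampled = F.interpolate(logits, size=size, mode="bilinear", align_corners=False)
    return upsampled.argmax(dim=1)

# Dummy logits: 4 classes at 72x72 (a quarter of a 288x288 input)
dummy = torch.randn(1, 4, 72, 72)
mask = logits_to_mask(dummy, (288, 288))
```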

## 📁 Project Structure

```
.
├── app.py                      # Gradio web interface
├── train.py                    # Model training script
├── test.py                     # Comprehensive evaluation
├── test_simple.py              # Simple testing without ground truth
├── download_dataset.py         # Kaggle dataset download
├── prepare_dataset.py          # Data preprocessing
├── diagnose.py                 # System diagnostics
├── demo.ipynb                  # Interactive notebook
├── requirements.txt            # Python dependencies
├── LICENSE                     # MIT License
├── README.md                   # This file
├── TRAINING_GUIDE.md           # Detailed training instructions
├── IMPLEMENTATION_SUMMARY.md   # Technical details
├── FILE_INDEX.md               # File navigation guide
├── samples/                    # 40 pre-loaded sample images
├── segformer_trained_weights/  # Pre-trained model
│   ├── config.json
│   └── pytorch_model.bin
└── test_results_simple/        # Test outputs
```

## 🧠 Model Details

### Architecture
- **Model**: SegFormer-B0
- **Framework**: HuggingFace Transformers
- **Pre-training**: Cityscapes dataset
- **Fine-tuning**: UW-Madison GI Tract Dataset

### Specifications
| Aspect | Value |
|--------|-------|
| Input Size | 288 × 288 pixels |
| Output Classes | 4 (background + 3 organs) |
| Parameters | 64M |
| Model Size | 256 MB |
| Inference Time | ~500ms (CPU), ~100ms (GPU) |

### Normalization
```
Mean: [0.485, 0.456, 0.406]
Std:  [0.229, 0.224, 0.225]
```
(ImageNet standard)
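Applied manually (outside the HuggingFace processor), this normalization amounts to the following sketch; the constants are the ImageNet values listed above:

```python
import numpy as np

# ImageNet normalization constants (match the table above)
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])

def normalize(image: np.ndarray) -> np.ndarray:
    """Normalize an HxWx3 uint8 image to the model's expected input range."""
    scaled = image.astype(np.float32) / 255.0  # to [0, 1]
    return (scaled - MEAN) / STD

# A mid-gray pixel maps close to zero after normalization
pixel = normalize(np.full((1, 1, 3), 128, dtype=np.uint8))
```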

## 🎓 Training

### Download Full Dataset
```bash
# Requires Kaggle API key setup
python download_dataset.py
```

### Prepare Data
```bash
python prepare_dataset.py \
    --data-dir /path/to/downloaded/data \
    --output-dir prepared_data
```

### Train Model
```bash
python train.py \
    --epochs 20 \
    --batch-size 16 \
    --learning-rate 1e-4 \
    --train-dir prepared_data/train_images \
    --val-dir prepared_data/val_images
```

### Evaluate
```bash
python test.py \
    --model models/best_model \
    --test-images prepared_data/test_images \
    --test-masks prepared_data/test_masks \
    --visualize
```

See [TRAINING_GUIDE.md](TRAINING_GUIDE.md) for detailed instructions.

## 📡 API Reference

### app.py
```python
def predict(image: Image.Image) -> Tuple[Image.Image, str]:
    """Perform segmentation on input image."""
    
def get_model(device: torch.device) -> SegformerForSemanticSegmentation:
    """Load pre-trained model."""
```

### test_simple.py
```python
class SimpleSegmentationTester:
    def test_batch(self, image_paths: List[str]) -> Dict:
        """Segment multiple images."""
```

### train.py
```python
class MedicalImageSegmentationTrainer:
    def train(self, num_epochs: int) -> None:
        """Train model with validation."""
```

## 🔄 Preprocessing Pipeline

1. **Image Resize**: 288 × 288
2. **Normalization**: ImageNet standard (mean/std)
3. **Tensor Conversion**: Convert to PyTorch tensors
4. **Device Transfer**: Move to GPU/CPU
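The four steps above can be sketched end-to-end as follows. This is a minimal illustration assuming a 288 × 288 input and the ImageNet constants from the Model Details section, not the exact code in `app.py`:

```python
import numpy as np
import torch
from PIL import Image

def preprocess(image: Image.Image, device: torch.device) -> torch.Tensor:
    """Apply the four preprocessing steps to a PIL image."""
    # 1. Resize to the model's input size
    image = image.convert("RGB").resize((288, 288))
    # 2. Normalize with the ImageNet mean/std
    arr = np.asarray(image, dtype=np.float32) / 255.0
    arr = (arr - [0.485, 0.456, 0.406]) / [0.229, 0.224, 0.225]
    # 3. Convert to a CHW float tensor with a batch dimension
    tensor = torch.from_numpy(arr).permute(2, 0, 1).unsqueeze(0).float()
    # 4. Move to the target device
    return tensor.to(device)
```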

## 📊 Output Format

### Web Interface
- Colored overlay image (red/green/blue for organs)
- Confidence percentages per organ
- Downloadable result image

### JSON Output (test_simple.py)
```json
{
  "case101_day26": {
    "large_bowel_pixels": 244,
    "small_bowel_pixels": 1901,
    "stomach_pixels": 2979,
    "total_segmented": 5124
  }
}
```
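Consuming this JSON from another script is straightforward; for example, ranking cases by segmented area (the `summarize` helper is illustrative, not part of the repository):

```python
import json

def summarize(results: dict) -> list:
    """Sort cases by total segmented pixels, largest first."""
    return sorted(results.items(),
                  key=lambda kv: kv[1]["total_segmented"],
                  reverse=True)

sample = json.loads("""{
  "case101_day26": {
    "large_bowel_pixels": 244,
    "small_bowel_pixels": 1901,
    "stomach_pixels": 2979,
    "total_segmented": 5124
  }
}""")
top_case, counts = summarize(sample)[0]
print(top_case, counts["total_segmented"])  # case101_day26 5124
```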

## 🐛 Troubleshooting

### ModuleNotFoundError
```bash
pip install -r requirements.txt --default-timeout=1000
```

### CUDA Out of Memory
```python
# Use CPU instead
device = torch.device('cpu')

# Or reduce batch size
batch_size = 4
```

### Model Loading Issues
```bash
python diagnose.py  # Check all requirements
```

## 📈 Performance Metrics

Evaluated on validation set:
- **mIoU**: Intersection over Union
- **Precision**: Per-class accuracy
- **Recall**: Organ detection rate
- **F1-Score**: Harmonic mean
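For illustration, per-class IoU can be computed from integer-labeled masks as in the sketch below; the repository's `test.py` may implement it differently:

```python
import numpy as np

def per_class_iou(pred: np.ndarray, target: np.ndarray,
                  num_classes: int = 4) -> list:
    """Compute IoU per class from integer-labeled prediction/target masks."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        inter = np.logical_and(p, t).sum()
        union = np.logical_or(p, t).sum()
        # NaN when the class appears in neither mask
        ious.append(float(inter) / union if union else float("nan"))
    return ious

pred = np.array([[0, 1], [2, 3]])
target = np.array([[0, 1], [2, 2]])
print(per_class_iou(pred, target))  # [1.0, 1.0, 0.5, 0.0]
```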

See [IMPLEMENTATION_SUMMARY.md](IMPLEMENTATION_SUMMARY.md) for details.

## 🤝 Contributing

Contributions welcome! Areas for improvement:
- [ ] Add more organ classes
- [ ] Improve inference speed
- [ ] Add DICOM format support
- [ ] Deploy to Hugging Face Spaces
- [ ] Add multi-modal support (CT/MRI)

## 📚 References

- [UW-Madison GI Tract Dataset](https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation)
- [SegFormer Paper](https://arxiv.org/abs/2105.15203)
- [HuggingFace Transformers](https://huggingface.co/docs/transformers)

## 📝 License

This project is licensed under the MIT License - see [LICENSE](LICENSE) file for details.

## 👥 Citation

If you use this project, please cite:
```bibtex
@software{medical_image_seg_2026,
  title={Medical Image Segmentation - UW-Madison GI Tract},
  author={Hungkm},
  year={2026},
  url={https://github.com/hung2903/medical-image-segmentation}
}
```

## 📧 Contact

For questions or issues:
- Open a GitHub issue
- Email: kmh2903.dsh@gmail.com

---

**Made with ❤️ for medical imaging**