---
title: Medical Image Segmentation - GI Tract
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "4.11.0"
python_version: "3.9"
app_file: app.py
pinned: false
---
# 🏥 Medical Image Segmentation - UW-Madison GI Tract
![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Python](https://img.shields.io/badge/python-3.8+-green.svg)
![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)
![Status](https://img.shields.io/badge/status-production--ready-success.svg)
> Automated semantic segmentation of gastrointestinal tract organs in medical CT/MRI images, built on SegFormer with a Gradio web interface.
## 📋 Table of Contents
- [Overview](#overview)
- [Features](#features)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [Model Details](#model-details)
- [Training](#training)
- [API Reference](#api-reference)
- [Contributing](#contributing)
- [License](#license)
## 📊 Overview
This project provides an end-to-end solution for segmenting GI tract organs in medical images:
- **Stomach**
- **Large Bowel**
- **Small Bowel**
Built with state-of-the-art SegFormer architecture and trained on the UW-Madison GI Tract Image Segmentation dataset (45K+ images).
### Key Achievements
- ✅ Efficient 64M-parameter model
- ✅ Interactive Gradio web interface
- ✅ Real-time inference on CPU/GPU
- ✅ 40+ pre-loaded sample images
- ✅ Complete training pipeline included
- ✅ Production-ready code
## ✨ Features
### Core Capabilities
- **Web Interface**: Upload images and get instant segmentation predictions
- **Batch Processing**: Test on multiple images simultaneously
- **Color-Coded Output**: Intuitive visual representation of organ locations
- **Confidence Scores**: Pixel-level confidence metrics for each organ
- **Interactive Notebook**: Educational Jupyter notebook with step-by-step examples
### Development Tools
- Data download automation (Kaggle integration)
- Dataset preparation and preprocessing
- Model training with validation
- Comprehensive evaluation metrics
- Diagnostic system checker
- Simple testing without ground truth
## 🚀 Installation
### Requirements
- Python 3.8 or higher
- CUDA 11.8+ (optional, for GPU acceleration)
- 4GB RAM minimum (8GB recommended)
- 2GB disk space
### Step 1: Clone Repository
```bash
git clone https://github.com/hung2903/medical-image-segmentation.git
cd medical-image-segmentation
```
### Step 2: Create Virtual Environment
```bash
# Using venv
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Or using conda
conda create -n medseg python=3.10
conda activate medseg
```
### Step 3: Install Dependencies
```bash
pip install -r requirements.txt
```
### Step 4: Verify Installation
```bash
python diagnose.py
```
All checks should show ✅ PASSED.
## 🎯 Quick Start
### 1. Run Web Interface (Easiest)
```bash
python app.py
```
Then open http://127.0.0.1:7860 in your browser.
### 2. Test on Sample Images
```bash
python test_simple.py \
    --model segformer_trained_weights \
    --images samples \
    --output-dir results
```
### 3. Interactive Jupyter Notebook
```bash
jupyter notebook demo.ipynb
```
## 📖 Usage
### Web Interface
1. Launch: `python app.py`
2. Upload medical image (PNG/JPG)
3. Click "Generate Predictions"
4. View color-coded segmentation with confidence scores
5. Download result image
**Supported Formats**: PNG, JPG, JPEG, GIF, BMP, WEBP
### Python (app module)
```python
from app import get_model, predict
import torch
from PIL import Image
# Load model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = get_model(device)
# Load image
image = Image.open('sample.png')
# Get predictions
output_image, confidence_info = predict(image)
```
### Python API
```python
import torch
from PIL import Image
from transformers import SegformerForSemanticSegmentation, SegformerImageProcessor

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = SegformerForSemanticSegmentation.from_pretrained(
    'segformer_trained_weights'
).to(device)
processor = SegformerImageProcessor()

# Process image
image = Image.open('sample.png').convert('RGB')
inputs = processor(image, return_tensors='pt').to(device)
outputs = model(**inputs)
logits = outputs.logits  # shape: (batch, num_classes, H/4, W/4)
```
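SegFormer produces logits at a quarter of the input resolution, so they need to be upsampled before taking the per-pixel argmax. A minimal post-processing sketch, assuming class ids 0 = background and 1-3 = the three organs (the helper name `logits_to_mask` is illustrative, not part of the project API):

```python
import torch
import torch.nn.functional as F

def logits_to_mask(logits: torch.Tensor, size=(288, 288)) -> torch.Tensor:
    """Convert raw logits (N, C, h, w) to a per-pixel class mask (N, H, W)."""
    # Upsample the low-resolution logits back to the input size
    upsampled = F.interpolate(logits, size=size, mode="bilinear", align_corners=False)
    # Pick the most likely class per pixel
    return upsampled.argmax(dim=1)

# Example with dummy logits: batch of 1, 4 classes, 72x72 feature map
dummy = torch.randn(1, 4, 72, 72)
mask = logits_to_mask(dummy)
print(mask.shape)  # torch.Size([1, 288, 288])
```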
## 📁 Project Structure
```
.
├── app.py # Gradio web interface
├── train.py # Model training script
├── test.py # Comprehensive evaluation
├── test_simple.py # Simple testing without ground truth
├── download_dataset.py # Kaggle dataset download
├── prepare_dataset.py # Data preprocessing
├── diagnose.py # System diagnostics
├── demo.ipynb # Interactive notebook
├── requirements.txt # Python dependencies
├── LICENSE # MIT License
├── README.md # This file
├── TRAINING_GUIDE.md # Detailed training instructions
├── IMPLEMENTATION_SUMMARY.md # Technical details
├── FILE_INDEX.md # File navigation guide
├── samples/ # 40 pre-loaded sample images
├── segformer_trained_weights/ # Pre-trained model
│ ├── config.json
│ └── pytorch_model.bin
└── test_results_simple/ # Test outputs
```
## 🧠 Model Details
### Architecture
- **Model**: SegFormer-B0
- **Framework**: HuggingFace Transformers
- **Pre-training**: Cityscapes dataset
- **Fine-tuning**: UW-Madison GI Tract Dataset
### Specifications
| Aspect | Value |
|--------|-------|
| Input Size | 288 × 288 pixels |
| Output Classes | 4 (background + 3 organs) |
| Parameters | 64M |
| Model Size | 256 MB |
| Inference Time | ~500ms (CPU), ~100ms (GPU) |
### Normalization
```
Mean: [0.485, 0.456, 0.406]
Std: [0.229, 0.224, 0.225]
```
(ImageNet standard)
## 🎓 Training
### Download Full Dataset
```bash
# Requires Kaggle API key setup
python download_dataset.py
```
### Prepare Data
```bash
python prepare_dataset.py \
    --data-dir /path/to/downloaded/data \
    --output-dir prepared_data
```
### Train Model
```bash
python train.py \
    --epochs 20 \
    --batch-size 16 \
    --learning-rate 1e-4 \
    --train-dir prepared_data/train_images \
    --val-dir prepared_data/val_images
```
### Evaluate
```bash
python test.py \
    --model models/best_model \
    --test-images prepared_data/test_images \
    --test-masks prepared_data/test_masks \
    --visualize
```
See [TRAINING_GUIDE.md](TRAINING_GUIDE.md) for detailed instructions.
## 📡 API Reference
### app.py
```python
def predict(image: Image.Image) -> Tuple[Image.Image, str]:
    """Perform segmentation on input image."""

def get_model(device: torch.device) -> SegformerForSemanticSegmentation:
    """Load pre-trained model."""
```
### test_simple.py
```python
class SimpleSegmentationTester:
    def test_batch(self, image_paths: List[str]) -> Dict:
        """Segment multiple images."""
```
### train.py
```python
class MedicalImageSegmentationTrainer:
    def train(self, num_epochs: int) -> None:
        """Train model with validation."""
```
## 🔄 Preprocessing Pipeline
1. **Image Resize**: 288 × 288
2. **Normalization**: ImageNet standard (mean/std)
3. **Tensor Conversion**: Convert to PyTorch tensors
4. **Device Transfer**: Move to GPU/CPU
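The four steps above can be sketched as plain NumPy/PyTorch code. This is only an illustration of the pipeline; in the actual app these steps are handled by `SegformerImageProcessor`, and the helper name `preprocess` is hypothetical:

```python
import numpy as np
import torch
from PIL import Image

# ImageNet normalization statistics (see "Normalization" above)
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image: Image.Image, device: str = "cpu") -> torch.Tensor:
    # 1. Resize to the model's expected input size
    image = image.convert("RGB").resize((288, 288), Image.BILINEAR)
    # 2. Normalize with ImageNet mean/std
    arr = (np.asarray(image, dtype=np.float32) / 255.0 - MEAN) / STD
    # 3. Convert to a (1, 3, H, W) PyTorch tensor
    tensor = torch.from_numpy(arr).permute(2, 0, 1).unsqueeze(0)
    # 4. Move to the target device
    return tensor.to(device)

x = preprocess(Image.new("RGB", (512, 512)))
print(x.shape)  # torch.Size([1, 3, 288, 288])
```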
## 📊 Output Format
### Web Interface
- Colored overlay image (red/green/blue for organs)
- Confidence percentages per organ
- Downloadable result image
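The colored overlay can be sketched as a simple alpha blend. The class-to-color mapping below (1 = large bowel → red, 2 = small bowel → green, 3 = stomach → blue) is an assumption for illustration and may differ from the app's actual palette:

```python
import numpy as np

# Assumed mapping of class ids to RGB colors
COLORS = {1: (255, 0, 0), 2: (0, 255, 0), 3: (0, 0, 255)}

def overlay(image: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend per-class colors onto an RGB uint8 image wherever mask > 0."""
    out = image.astype(np.float32)
    for cls, color in COLORS.items():
        sel = mask == cls
        # Weighted blend: keep (1 - alpha) of the image, add alpha of the color
        out[sel] = (1 - alpha) * out[sel] + alpha * np.array(color, dtype=np.float32)
    return out.astype(np.uint8)
```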
### JSON Output (test_simple.py)
```json
{
  "case101_day26": {
    "large_bowel_pixels": 244,
    "small_bowel_pixels": 1901,
    "stomach_pixels": 2979,
    "total_segmented": 5124
  }
}
```
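The summary above is plain JSON, so downstream analysis is a one-liner away. A small sketch of computing each organ's share of the segmented area (the inline JSON string here stands in for the results file, whose exact name is not assumed):

```python
import json

# Example per-case summary in the format shown above
results = json.loads("""
{
  "case101_day26": {
    "large_bowel_pixels": 244,
    "small_bowel_pixels": 1901,
    "stomach_pixels": 2979,
    "total_segmented": 5124
  }
}
""")

for case, stats in results.items():
    # Fraction of segmented pixels belonging to each organ
    frac = {k: v / stats["total_segmented"]
            for k, v in stats.items() if k.endswith("_pixels")}
    print(case, frac)
```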
## 🐛 Troubleshooting
### ModuleNotFoundError
```bash
pip install -r requirements.txt --default-timeout=1000
```
### CUDA Out of Memory
```python
# Use CPU instead
device = torch.device('cpu')
# Or reduce batch size
batch_size = 4
```
### Model Loading Issues
```bash
python diagnose.py # Check all requirements
```
## 📈 Performance Metrics
Evaluated on validation set:
- **mIoU**: Mean Intersection over Union across classes
- **Precision**: Per-class positive predictive value
- **Recall**: Per-class organ detection rate
- **F1-Score**: Harmonic mean of precision and recall
See [IMPLEMENTATION_SUMMARY.md](IMPLEMENTATION_SUMMARY.md) for details.
## 🤝 Contributing
Contributions welcome! Areas for improvement:
- [ ] Add more organ classes
- [ ] Improve inference speed
- [ ] Add DICOM format support
- [ ] Deploy to Hugging Face Spaces
- [ ] Add multi-modal support (CT/MRI)
## 📚 References
- [UW-Madison GI Tract Dataset](https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation)
- [SegFormer Paper](https://arxiv.org/abs/2105.15203)
- [HuggingFace Transformers](https://huggingface.co/docs/transformers)
## 📝 License
This project is licensed under the MIT License - see [LICENSE](LICENSE) file for details.
## 👥 Citation
If you use this project, please cite:
```bibtex
@software{medical_image_seg_2026,
  title={Medical Image Segmentation - UW-Madison GI Tract},
  author={Hungkm},
  year={2026},
  url={https://github.com/hung2903/medical-image-segmentation}
}
```
## 📧 Contact
For questions or issues:
- Open a GitHub issue
- Email: kmh2903.dsh@gmail.com
---
**Made with ❤️ for medical imaging**