--- |
|
|
language: |
|
|
- en |
|
|
tags: |
|
|
- medical-imaging |
|
|
- chest-xray |
|
|
- chexpert |
|
|
- multi-label-classification |
|
|
- mae |
|
|
- densenet |
|
|
- fpn |
|
|
- deep-learning |
|
|
- healthcare |
|
|
license: mit |
|
|
datasets: |
|
|
- chexpert |
|
|
library_name: pytorch |
|
|
pipeline_tag: image-classification |
|
|
model_name: CheXpert MAE-DenseNet-FPN
|
|
model_type: hybrid transformer-cnn |
|
|
metrics: |
|
|
- roc_auc |
|
|
--- |
|
|
|
|
|
# CheXpert MAE-DenseNet-FPN |
|
|
|
|
|
A deep learning framework for multi-label chest X-ray classification using a hybrid architecture combining **Masked Autoencoders (MAE)**, **DenseNet** with CBAM attention, and **Feature Pyramid Networks (FPN)** with bidirectional cross-attention fusion. |
|
|
|
|
|
## 🏗️ Architecture Overview |
|
|
|
|
|
This project implements a dual-branch feature-fusion architecture for medical image classification:
|
|
|
|
|
- **MAE Encoder**: Vision Transformer-based masked autoencoder for self-supervised feature extraction |
|
|
- **DenseNet-169**: Dense convolutional network with Channel and Spatial Attention (CBAM) |
|
|
- **Feature Pyramid Network**: Multi-scale feature extraction at 4 different resolutions |
|
|
- **Bidirectional Cross-Attention**: Fusion mechanism allowing MAE and DenseNet features to attend to each other |
|
|
- **Learned Logit Ensemble**: Weighted combination of 7 prediction heads with learnable per-head weights and temperature scaling (a minimal sketch follows the diagram below)
|
|
|
|
|
### Key Components |
|
|
|
|
|
``` |
|
|
Input Image (384×384) |
|
|
│ |
|
|
├─────────────────────────────┐ |
|
|
│ │ |
|
|
▼ ▼ |
|
|
MAE Encoder DenseNet-169 |
|
|
(ViT-based) (with CBAM) |
|
|
│ │ |
|
|
│ ┌───────────────────┤ |
|
|
│ │ │ |
|
|
│ FPN Pyramid Dense Features |
|
|
│ (P1-P4) (Multi-scale) |
|
|
│ │ │ |
|
|
└─────────┴───────────────────┘ |
|
|
│ |
|
|
Bidirectional Cross-Attention |
|
|
│ |
|
|
┌─────────┴──────────┐ |
|
|
│ │ |
|
|
MAE Head Dense Head + 4 FPN Heads |
|
|
│ │ |
|
|
└────────┬───────────┘ |
|
|
│ |
|
|
Learned Ensemble (7 heads) |
|
|
│ |
|
|
▼ |
|
|
14-class Predictions |
|
|
``` |
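
The diagram ends in a learned ensemble over seven prediction heads, as referenced in the component list above. The authoritative implementation is in `models/classifier.py`; the snippet below is only a minimal sketch of the idea, assuming each head emits raw logits of shape `(batch, num_classes)` and that the ensemble learns per-head weights plus a single temperature.

```python
import torch
import torch.nn as nn

class LearnedLogitEnsemble(nn.Module):
    """Minimal sketch: weighted average of per-head logits with a learnable temperature."""
    def __init__(self, num_heads: int = 7):
        super().__init__()
        self.head_weights = nn.Parameter(torch.zeros(num_heads))  # softmax-normalised below
        self.log_temperature = nn.Parameter(torch.zeros(1))       # temperature = exp(log_temperature)

    def forward(self, head_logits):
        # head_logits: list of (batch, num_classes) tensors, one per prediction head
        stacked = torch.stack(head_logits, dim=0)                  # (heads, batch, classes)
        weights = torch.softmax(self.head_weights, dim=0)          # convex combination over heads
        fused = torch.einsum("h,hbc->bc", weights, stacked)        # weighted sum of logits
        return fused / self.log_temperature.exp()                  # temperature-scaled logits
```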
|
|
|
|
|
## ✨ Features |
|
|
|
|
|
- **Hybrid Architecture**: Combines transformer-based and convolutional approaches |
|
|
- **Multi-scale Learning**: FPN extracts features at 4 different resolutions |
|
|
- **Advanced Fusion**: Bidirectional cross-attention between MAE and DenseNet features |
|
|
- **Optimized Training**: |
|
|
- Mixed precision training (FP16) |
|
|
- Gradient accumulation |
|
|
- Weighted sampling for class imbalance |
|
|
- Cosine annealing with linear warmup |
|
|
- Gradient checkpointing for memory efficiency |
|
|
- **Smart Data Loading**: |
|
|
- ZIP file reader with LRU caching (sketched after this list)
|
|
- On-the-fly augmentation using Albumentations |
|
|
- Multi-worker data loading with persistent workers |
|
|
- **Comprehensive Evaluation**: |
|
|
- Per-class AUC metrics |
|
|
- Optimal threshold computation per class |
|
|
- Macro and Micro AUC tracking |
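
The ZIP reader with LRU caching mentioned above lives in `data/dataset.py`; the fragment below is only a sketch of the idea, assuming images are addressed by their in-archive path and decoded with OpenCV.

```python
import zipfile
from functools import lru_cache

import cv2
import numpy as np

class ZipImageReader:
    """Sketch: read images straight from a ZIP archive, caching recently used ones."""
    def __init__(self, zip_path: str, cache_size: int = 2048):
        self.zip_path = zip_path
        self._zf = None
        # Bind an LRU cache to this instance so hot images skip decompression.
        self._read_cached = lru_cache(maxsize=cache_size)(self._read)

    def _zipfile(self) -> zipfile.ZipFile:
        if self._zf is None:   # open lazily so each DataLoader worker gets its own handle
            self._zf = zipfile.ZipFile(self.zip_path, "r")
        return self._zf

    def _read(self, name: str) -> np.ndarray:
        raw = self._zipfile().read(name)
        return cv2.imdecode(np.frombuffer(raw, np.uint8), cv2.IMREAD_GRAYSCALE)

    def __call__(self, name: str) -> np.ndarray:
        return self._read_cached(name)
```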
|
|
|
|
|
## 📋 Requirements |
|
|
|
|
|
- Python 3.8+ |
|
|
- CUDA-capable GPU (recommended: 16GB+ VRAM) |
|
|
- CheXpert dataset |
|
|
|
|
|
## 🚀 Installation |
|
|
|
|
|
1. **Clone the repository** |
|
|
```bash |
|
|
git clone https://github.com/adelelsayed/chexpert-mae-densenet-fpn.git |
|
|
cd chexpert-mae-densenet-fpn |
|
|
``` |
|
|
|
|
|
2. **Create a virtual environment** |
|
|
```bash |
|
|
python -m venv venv |
|
|
source venv/bin/activate # On Windows: venv\Scripts\activate |
|
|
``` |
|
|
|
|
|
3. **Install dependencies** |
|
|
```bash |
|
|
pip install -r requirements.txt |
|
|
``` |
|
|
|
|
|
## 📊 Dataset Setup |
|
|
|
|
|
1. **Download CheXpert Dataset** |
|
|
- Visit: https://stanfordmlgroup.github.io/competitions/chexpert/ |
|
|
- Download CheXpert-v1.0-small |
|
|
|
|
|
2. **Prepare the dataset** |
|
|
```bash |
|
|
# Extract the dataset |
|
|
unzip CheXpert-v1.0-small.zip |
|
|
|
|
|
# Optionally, create a ZIP archive for faster loading |
|
|
cd CheXpert-v1.0-small |
|
|
zip -r chexpert.zip train/ valid/ |
|
|
``` |
|
|
|
|
|
3. **Update configuration** |
|
|
- Edit `configs/configs.py` |
|
|
- Update `root` variable to point to your dataset location |
|
|
- Update all paths accordingly |
|
|
|
|
|
## 🔧 Configuration |
|
|
|
|
|
Edit `configs/configs.py` to customize: |
|
|
|
|
|
```python |
|
|
# Example: Update paths |
|
|
root = "/path/to/your/data" |
|
|
|
|
|
mae_config = { |
|
|
"lr": 1e-4, |
|
|
"num_epochs": 200, |
|
|
"batch_size": 96, |
|
|
# ... other parameters |
|
|
} |
|
|
|
|
|
config = { |
|
|
"lr": 1e-4, |
|
|
"num_epochs": 200, |
|
|
"batch_size": 36, |
|
|
# ... other parameters |
|
|
} |
|
|
``` |
|
|
|
|
|
## 🎯 Training |
|
|
|
|
|
### Phase 1: Pre-train MAE |
|
|
|
|
|
```bash |
|
|
python trainer/trainer.py |
|
|
# When prompted, type: mae |
|
|
``` |
|
|
|
|
|
The MAE pre-training learns robust feature representations through masked image reconstruction. |
|
|
|
|
|
### Phase 2: Train Classifier |
|
|
|
|
|
```bash |
|
|
python trainer/trainer.py |
|
|
# When prompted, type: classifier |
|
|
``` |
|
|
|
|
|
This loads the pre-trained MAE encoder and trains the full classification pipeline. |
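
Concretely, the hand-off from pre-training to classification happens inside `trainer/trainer.py`. A rough, purely illustrative sketch of the idea follows; the attribute name `mae_encoder` and the checkpoint key are hypothetical, not the repo's actual names.

```python
import torch
import torch.nn as nn

def load_pretrained_encoder(classifier: nn.Module, ckpt_path: str) -> None:
    """Hypothetical sketch: copy pre-trained MAE encoder weights into the classifier branch."""
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    state = checkpoint.get("model_state", checkpoint)   # key name is an assumption
    missing, unexpected = classifier.mae_encoder.load_state_dict(state, strict=False)
    print(f"Loaded encoder weights (missing={len(missing)}, unexpected={len(unexpected)})")
```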
|
|
|
|
|
### Training Configuration |
|
|
|
|
|
- **MAE Training**: |
|
|
- Batch size: 96 |
|
|
- Mask ratio: 0.75 (masks 75% of patches) |
|
|
- Reconstruction loss on masked patches |
|
|
|
|
|
- **Classifier Training**: |
|
|
- Batch size: 36 with gradient accumulation (8 steps) |
|
|
- Effective batch size: 288 |
|
|
- Asymmetric loss with class weights (sketched below)
|
|
- Per-class threshold optimization |
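
The asymmetric loss referenced above is implemented in `loss/assymetric.py`. For orientation, here is a minimal sketch of a generic asymmetric loss in the spirit of Ben-Baruch et al. (2020), with optional per-class weights; the repository's exact formulation and hyperparameters may differ.

```python
import torch
import torch.nn as nn

class AsymmetricLoss(nn.Module):
    """Sketch of a generic multi-label asymmetric loss; the repo's version may differ."""
    def __init__(self, gamma_pos: float = 0.0, gamma_neg: float = 4.0,
                 clip: float = 0.05, class_weights: torch.Tensor = None):
        super().__init__()
        self.gamma_pos, self.gamma_neg, self.clip = gamma_pos, gamma_neg, clip
        self.class_weights = class_weights  # shape (num_classes,) or None

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        p = torch.sigmoid(logits)
        p_neg = (p - self.clip).clamp(min=0)   # probability shifting for easy negatives
        loss_pos = targets * (1 - p).pow(self.gamma_pos) * torch.log(p.clamp(min=1e-8))
        loss_neg = (1 - targets) * p_neg.pow(self.gamma_neg) * torch.log((1 - p_neg).clamp(min=1e-8))
        loss = -(loss_pos + loss_neg)
        if self.class_weights is not None:
            loss = loss * self.class_weights.to(loss.device)
        return loss.mean()
```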
|
|
|
|
|
## 🧪 Testing |
|
|
|
|
|
```python |
|
|
from trainer.utils import Trainer |
|
|
from configs.configs import config |
|
|
|
|
|
# Initialize trainer |
|
|
trainer = Trainer(config) |
|
|
|
|
|
# Run evaluation on test set |
|
|
macro_auc, micro_auc, per_class = trainer.test( |
|
|
model_path="path/to/checkpoint.pth" |
|
|
) |
|
|
|
|
|
print(f"Macro AUC: {macro_auc:.4f}") |
|
|
print(f"Micro AUC: {micro_auc:.4f}") |
|
|
``` |
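
The per-class threshold optimization mentioned in the feature and training lists is handled by the repo's evaluation code. One common recipe, shown here only as an illustrative sketch, is to maximise Youden's J statistic (TPR - FPR) on each class's validation ROC curve:

```python
import numpy as np
from sklearn.metrics import roc_curve

def optimal_thresholds(y_true: np.ndarray, y_prob: np.ndarray) -> np.ndarray:
    """Sketch: pick the threshold maximising TPR - FPR (Youden's J) for each class.

    y_true, y_prob: arrays of shape (num_samples, num_classes).
    """
    thresholds = np.zeros(y_true.shape[1])
    for c in range(y_true.shape[1]):
        fpr, tpr, thr = roc_curve(y_true[:, c], y_prob[:, c])
        thresholds[c] = thr[np.argmax(tpr - fpr)]
    return thresholds
```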
|
|
|
|
|
## 📁 Project Structure |
|
|
|
|
|
``` |
|
|
chexpert-mae-densenet-fpn/ |
|
|
├── configs/ |
|
|
│ ├── __init__.py |
|
|
│ └── configs.py # Configuration parameters |
|
|
├── data/ |
|
|
│ ├── __init__.py |
|
|
│ ├── dataset.py # CheXpert dataset with ZIP caching |
|
|
│ └── splitter.py # Data splitting utilities |
|
|
├── loss/ |
|
|
│ ├── __init__.py |
|
|
│ └── assymetric.py # Asymmetric loss for imbalanced data |
|
|
├── models/ |
|
|
│ ├── __init__.py |
|
|
│ ├── mae.py # Masked Autoencoder implementation |
|
|
│ ├── densenet.py # DenseNet-169 with CBAM |
|
|
│ └── classifier.py # Full classification architecture |
|
|
├── trainer/ |
|
|
│ ├── __init__.py |
|
|
│ ├── trainer.py # Main training script |
|
|
│ ├── utils.py # Training utilities and loops |
|
|
│ └── test.py # Testing utilities |
|
|
├── notebooks/ |
|
|
│ ├── chexpert_mae.ipynb # MAE experiments |
|
|
│ └── chexpert_mae_mask_classifier.ipynb # Full pipeline experiments |
|
|
├── requirements.txt |
|
|
└── README.md |
|
|
``` |
|
|
|
|
|
## 📈 Model Architecture Details |
|
|
|
|
|
### MAE Encoder |
|
|
- **Patch size**: 16×16 |
|
|
- **Embedding dim**: 768 |
|
|
- **Depth**: 12 transformer blocks |
|
|
- **Heads**: 8 attention heads |
|
|
- **MLP ratio**: 4× |
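
With 16×16 patches on 384×384 inputs the encoder sees 24×24 = 576 tokens, of which 75% are masked during pre-training. Below is a minimal sketch of per-sample random patch masking in the style of He et al. (2021); the authoritative version is `models/mae.py`.

```python
import torch

def random_masking(tokens: torch.Tensor, mask_ratio: float = 0.75):
    """Sketch: keep a random subset of patch tokens per sample, as in MAE."""
    batch, num_tokens, dim = tokens.shape
    num_keep = int(num_tokens * (1 - mask_ratio))
    noise = torch.rand(batch, num_tokens, device=tokens.device)   # one score per token
    ids_shuffle = torch.argsort(noise, dim=1)                     # random permutation per sample
    ids_keep = ids_shuffle[:, :num_keep]
    visible = torch.gather(tokens, 1, ids_keep.unsqueeze(-1).expand(-1, -1, dim))
    mask = torch.ones(batch, num_tokens, device=tokens.device)    # 1 = masked, 0 = visible
    mask.scatter_(1, ids_keep, 0.0)
    return visible, mask, ids_keep
```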
|
|
|
|
|
### DenseNet-169 |
|
|
- **Growth rate (k)**: 64 |
|
|
- **Layers**: [6, 12, 24, 16] |
|
|
- **CBAM**: Channel + Spatial attention at each stage |
|
|
- **Dropout**: Progressive (0.05 → 0.1 → 0.1 → 0.1) |
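
CBAM (Woo et al., 2018) applies channel attention followed by spatial attention. A compact reference sketch of such a block, independent of the exact implementation in `models/densenet.py`:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Sketch of a CBAM block: channel attention, then spatial attention."""
    def __init__(self, channels: int, reduction: int = 16, spatial_kernel: int = 7):
        super().__init__()
        self.mlp = nn.Sequential(                 # shared MLP for avg- and max-pooled descriptors
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial = nn.Conv2d(2, 1, spatial_kernel, padding=spatial_kernel // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))        # channel attention from average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))         # ... and from max pooling
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        s = torch.cat([x.mean(dim=1, keepdim=True),               # spatial attention from
                       x.amax(dim=1, keepdim=True)], dim=1)       # channel-wise statistics
        return x * torch.sigmoid(self.spatial(s))
```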
|
|
|
|
|
### Cross-Attention Fusion |
|
|
- **12 bidirectional cross-attention layers** |
|
|
- **Projection dim**: 512 |
|
|
- **Attention heads**: 8 |
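
Bidirectional here means each stream queries the other: MAE tokens attend to DenseNet tokens and vice versa. A minimal sketch of one such layer using `nn.MultiheadAttention`, assuming both streams have already been projected to the 512-dimensional space listed above:

```python
import torch
import torch.nn as nn

class BidirectionalCrossAttention(nn.Module):
    """Sketch: one layer of mutual cross-attention between two token streams."""
    def __init__(self, dim: int = 512, num_heads: int = 8):
        super().__init__()
        self.mae_to_dense = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.dense_to_mae = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_mae = nn.LayerNorm(dim)
        self.norm_dense = nn.LayerNorm(dim)

    def forward(self, mae_tokens: torch.Tensor, dense_tokens: torch.Tensor):
        # MAE tokens query the DenseNet tokens ...
        mae_upd, _ = self.mae_to_dense(mae_tokens, dense_tokens, dense_tokens)
        # ... and DenseNet tokens query the MAE tokens, each with a residual connection.
        dense_upd, _ = self.dense_to_mae(dense_tokens, mae_tokens, mae_tokens)
        return self.norm_mae(mae_tokens + mae_upd), self.norm_dense(dense_tokens + dense_upd)
```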
|
|
|
|
|
### FPN |
|
|
- **Feature levels**: P1 (192×192), P2 (96×96), P3 (48×48), P4 (24×24) |
|
|
- **Channel unification**: 256 channels per level |
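
An FPN builds the P1-P4 pyramid by projecting each backbone stage to 256 channels with 1×1 lateral convolutions and adding a top-down pathway. An illustrative sketch follows; the backbone channel counts below are placeholders, not the repo's actual values.

```python
import torch.nn as nn
import torch.nn.functional as F

class SimpleFPN(nn.Module):
    """Sketch of an FPN top-down pathway unifying backbone stages to 256 channels."""
    def __init__(self, in_channels=(128, 256, 640, 1664), out_channels: int = 256):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1) for c in in_channels)
        self.smooth = nn.ModuleList(nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                    for _ in in_channels)

    def forward(self, features):
        # features: backbone maps, highest resolution first (e.g. P1 at 192x192 ... P4 at 24x24)
        laterals = [lat(f) for lat, f in zip(self.lateral, features)]
        for i in range(len(laterals) - 2, -1, -1):   # top-down: upsample coarser level and add
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], size=laterals[i].shape[-2:], mode="nearest")
        return [smooth(p) for smooth, p in zip(self.smooth, laterals)]  # P1..P4
```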
|
|
|
|
|
## 🎓 CheXpert Labels |
|
|
|
|
|
The model predicts 14 pathologies: |
|
|
|
|
|
1. No Finding |
|
|
2. Enlarged Cardiomediastinum |
|
|
3. Cardiomegaly |
|
|
4. Lung Opacity |
|
|
5. Lung Lesion |
|
|
6. Edema |
|
|
7. Consolidation |
|
|
8. Pneumonia |
|
|
9. Atelectasis |
|
|
10. Pneumothorax |
|
|
11. Pleural Effusion |
|
|
12. Pleural Other |
|
|
13. Fracture |
|
|
14. Support Devices |
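
For convenience, the same 14 labels as a Python constant; make sure the order matches the label columns configured in `configs/configs.py`:

```python
CHEXPERT_LABELS = [
    "No Finding", "Enlarged Cardiomediastinum", "Cardiomegaly", "Lung Opacity",
    "Lung Lesion", "Edema", "Consolidation", "Pneumonia", "Atelectasis",
    "Pneumothorax", "Pleural Effusion", "Pleural Other", "Fracture", "Support Devices",
]
```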
|
|
|
|
|
## 🔬 Data Augmentation |
|
|
|
|
|
Training augmentations (conservative for medical images; an illustrative pipeline follows this list):
|
|
- Horizontal flip (p=0.5) |
|
|
- Random affine (translation, scale, rotation ±10°) |
|
|
- Random brightness/contrast |
|
|
- CLAHE histogram equalization |
|
|
- Gaussian blur and noise |
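
An illustrative Albumentations pipeline along these lines; the probabilities and parameter ranges below are assumptions, and the dataset code in `data/dataset.py` defines the actual values.

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2

# Illustrative values only; see the repo's dataset code for the authoritative pipeline.
train_transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Affine(translate_percent=0.05, scale=(0.9, 1.1), rotate=(-10, 10), p=0.5),
    A.RandomBrightnessContrast(p=0.3),
    A.CLAHE(p=0.3),
    A.GaussianBlur(p=0.1),
    A.GaussNoise(p=0.1),
    A.Resize(384, 384),
    A.Normalize(mean=0.5, std=0.5),
    ToTensorV2(),
])
```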
|
|
|
|
|
## 💾 Checkpoints |
|
|
|
|
|
The training automatically saves: |
|
|
- **Best MAE checkpoint**: Based on validation reconstruction loss |
|
|
- **Best classifier checkpoint**: Based on validation AUC (macro/micro) |
|
|
- **Training history**: JSON file with all metrics |
|
|
- **Per-epoch metrics plots**: Loss and AUC curves |
|
|
|
|
|
## 📊 Monitoring |
|
|
|
|
|
Training logs are saved to: |
|
|
- `training_log.txt`: Training progress with live metrics |
|
|
- `val_log.txt`: Validation results |
|
|
- `test_log.txt`: Test evaluation results |
|
|
- `history.json`: All metrics across epochs |
|
|
- `metrics.png`: Visualization plots |
|
|
|
|
|
## ⚡ Performance Tips |
|
|
|
|
|
1. **Memory Optimization**: |
|
|
- Use gradient checkpointing (already enabled) |
|
|
- Reduce batch size if OOM occurs |
|
|
- Increase gradient accumulation steps |
|
|
|
|
|
2. **Speed Optimization**: |
|
|
- Use persistent workers (already enabled) |
|
|
- Enable cuDNN benchmark (already enabled) |
|
|
- Use ZIP caching for faster data loading |
|
|
|
|
|
3. **Training Stability**: |
|
|
- Gradient clipping at norm 1.0 |
|
|
- Mixed precision with dynamic loss scaling |
|
|
- Warmup learning rate schedule |
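
For reference, a condensed sketch of how these stability pieces (mixed precision with dynamic loss scaling, gradient accumulation, and clipping at norm 1.0) typically fit together in a PyTorch step, assuming `model`, `criterion`, `optimizer`, and `train_loader` are already built; `trainer/utils.py` contains the actual loop.

```python
import torch

scaler = torch.cuda.amp.GradScaler()          # dynamic loss scaling
accum_steps = 8                               # gradient accumulation steps

for step, (images, targets) in enumerate(train_loader):
    images, targets = images.cuda(), targets.cuda()
    with torch.cuda.amp.autocast():           # FP16 forward pass
        loss = criterion(model(images), targets) / accum_steps
    scaler.scale(loss).backward()
    if (step + 1) % accum_steps == 0:
        scaler.unscale_(optimizer)            # unscale before clipping
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
```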
|
|
|
|
|
## 🐛 Troubleshooting |
|
|
|
|
|
**Q: Out of memory errors?** |
|
|
- Reduce batch size in configs.py |
|
|
- Increase gradient accumulation steps |
|
|
- Enable gradient checkpointing |
|
|
|
|
|
**Q: Slow training?** |
|
|
- Check if ZIP caching is enabled |
|
|
- Verify persistent workers are active |
|
|
- Monitor GPU utilization |
|
|
|
|
|
**Q: Poor convergence?** |
|
|
- Ensure MAE is properly pre-trained first |
|
|
- Check learning rate and warmup settings |
|
|
- Verify class weights are computed correctly |
|
|
|
|
|
## 📚 Citation |
|
|
|
|
|
If you use this code in your research, please cite: |
|
|
|
|
|
```bibtex |
|
|
@misc{chexpert-mae-densenet-fpn, |
|
|
author = {Adel Elsayed},
|
|
title = {CheXpert Classification with MAE-DenseNet-FPN}, |
|
|
year = {2025}, |
|
|
publisher = {GitHub}, |
|
|
url = {https://github.com/adelelsayed/chexpert-mae-densenet-fpn} |
|
|
} |
|
|
``` |
|
|
|
|
|
## 🙏 Acknowledgments |
|
|
|
|
|
- **CheXpert Dataset**: Stanford ML Group |
|
|
- **Masked Autoencoders**: Meta AI Research (He et al., 2021) |
|
|
- **DenseNet**: Huang et al., 2017 |
|
|
- **CBAM**: Woo et al., 2018 |
|
|
- **Feature Pyramid Networks**: Lin et al., 2017 |
|
|
|
|
|
## 📄 License |
|
|
|
|
|
|
|
This project is licensed under the MIT License. |
|
|
|
|
|
|
|
|
## 📧 Contact |
|
|
|
|
|
Connect with the author on LinkedIn: https://www.linkedin.com/in/adel-elsayed-a5260246/
|
|
|
|
|
**Note**: This is a research project. For clinical use, please ensure proper validation and regulatory approval. |