	---
	license: apache-2.0
	base_model:
	- facebook/sam2-hiera-tiny
	- facebook/sam2-hiera-small
	- facebook/sam2-hiera-base-plus
	- facebook/sam2-hiera-large
	pipeline_tag: image-segmentation
	---
	# Model Card: CoronarySAM2 - Fine-tuned SAM2 for Coronary Artery Segmentation

	## Model Details

	### Model Description

	CoronarySAM2 is a collection of fine-tuned Segment Anything Model 2 (SAM2) variants specifically optimized for coronary artery segmentation in X-ray angiography images. The models use point-based prompting to enable interactive and precise segmentation of coronary arteries from medical imaging data.

	- Developed by: Research Team
	- Model Type: Computer Vision - Image Segmentation
	- Base Architecture: SAM2 (Segment Anything Model 2) with Hiera backbone
	- Language(s): Python
	- License: Apache 2.0
	- Fine-tuned from: [facebook/segment-anything-2](https://github.com/facebookresearch/segment-anything-2)

	### Model Variants

	Four model variants are available, offering different trade-offs between speed and accuracy:

	| Model | Parameters | Checkpoint | Speed | Accuracy | Use Case |
	|-------|------------|------------|-------|----------|----------|
	| SAM2 Hiera Tiny | ~38M | `sam2_t/best_model.pt` | ⚡⚡⚡ Fast | ⭐⭐⭐ Good | Quick experiments, real-time feedback |
	| SAM2 Hiera Small | ~46M | `sam2_s/checkpoint_epoch_70.pt` | ⚡⚡ Medium | ⭐⭐⭐⭐ Very Good | Balanced performance, general use |
	| SAM2 Hiera Base Plus | ~80M | `sam2_b+/best_model.pt` | ⚡ Slower | ⭐⭐⭐⭐⭐ Excellent | High-quality results, clinical evaluation |
	| SAM2 Hiera Large | ~224M | `sam2_l/final_model.pt` | ⚡ Slowest | ⭐⭐⭐⭐⭐ Best | Maximum accuracy, research purposes |
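
When switching between variants in a script, the checkpoint column of the table can be kept in a small lookup. The sketch below is illustrative only: the short keys (`tiny`, `small`, `base_plus`, `large`) and the `checkpoint_for` helper are hypothetical names, and the default `ft_models` root is taken from the path used in Basic Usage.

```python
from pathlib import Path

# Checkpoint paths copied from the Model Variants table.
# The dictionary keys are hypothetical short names chosen for this sketch.
CHECKPOINTS = {
    "tiny": "sam2_t/best_model.pt",
    "small": "sam2_s/checkpoint_epoch_70.pt",
    "base_plus": "sam2_b+/best_model.pt",
    "large": "sam2_l/final_model.pt",
}


def checkpoint_for(variant: str, root: str = "ft_models") -> Path:
    """Return the checkpoint path for a variant, relative to `root`."""
    try:
        return Path(root) / CHECKPOINTS[variant]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"unknown variant {variant!r}; expected one of: {known}") from None


print(checkpoint_for("small"))
```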

	### Model Architecture

	The models follow the SAM2 architecture with the following components:

	1. Image Encoder: Hiera hierarchical vision transformer backbone
	2. Prompt Encoder: Encodes point prompts (positive/negative) as spatial embeddings
	3. Mask Decoder: Transformer-based decoder that generates high-quality segmentation masks
	4. Preprocessing Pipeline:
	   - X-ray image normalization using Gaussian blur
	   - CLAHE (Contrast Limited Adaptive Histogram Equalization) for vessel enhancement
	   - Fixed resolution resizing to 1024×1024 pixels
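
The prompt encoder in item 2 takes positive and negative clicks. A minimal sketch of how such clicks might be packed into the coordinate and label arrays used for point prompts is shown here; `build_point_prompts` is a hypothetical helper, not part of SAM2 or this repository.

```python
import numpy as np


def build_point_prompts(positive, negative=()):
    """Stack (x, y) clicks into coords (N, 2) and labels (N,) arrays.

    Label 1 marks a point on the vessel (positive), label 0 a point on
    the background (negative).
    """
    positive = list(positive)
    negative = list(negative)
    points = positive + negative
    if not points:
        raise ValueError("at least one point prompt is required")
    coords = np.asarray(points, dtype=np.float32).reshape(-1, 2)
    labels = np.array([1] * len(positive) + [0] * len(negative), dtype=np.int32)
    return coords, labels


coords, labels = build_point_prompts(
    positive=[(512, 300), (520, 310)],  # clicks on the artery
    negative=[(100, 100)],              # click on background
)
print(coords.shape, labels.tolist())
```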

	## Intended Use

	### Primary Use Cases

	- Interactive Coronary Artery Segmentation: Point-based annotation for precise artery delineation
	- Medical Image Analysis: Automated assistance for cardiologists and radiologists
	- Research: Computer-aided diagnosis and treatment planning research
	- Educational Purposes: Training and demonstration of medical image segmentation

	### Out-of-Scope Use

	- ❌ Clinical diagnosis without expert oversight
	- ❌ Automated treatment decisions
	- ❌ Real-time interventional guidance without validation
	- ❌ Non-coronary vessel segmentation (not trained for this task)
	- ❌ Modalities other than X-ray angiography (CT, MRI, etc.)

	## Training Data

	### Dataset

	The models were fine-tuned on coronary X-ray angiography images with annotations for coronary artery structures.

	Training Specifications:
	- Modality: X-ray Angiography
	- Target: Coronary Arteries
	- Annotation Type: Binary segmentation masks
	- Resolution: Images resized to 1024×1024 for training

	### Preprocessing

	All training images underwent the following preprocessing pipeline:

	1. Normalization: Gaussian blur-based intensity normalization
	2. CLAHE Enhancement: Adaptive histogram equalization (clip limit: 2.0, tile grid: 8×8)
	3. Resizing: Fixed 1024×1024 resolution
	4. Format: RGB format (grayscale images converted to RGB)

	## Evaluation

	### Metrics

	The models should be evaluated using the following metrics:

	- Dice Coefficient: Measures overlap between predicted and ground truth masks
	- IoU (Intersection over Union): Ratio of the overlap between predicted and ground truth masks to their union
	- Precision & Recall: For detecting true vessel pixels
	- Hausdorff Distance: Measures boundary accuracy
	- Inference Time: Speed benchmarks on various hardware
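
For reference, the overlap-based metrics in this list can be computed from binary masks as sketched below. `segmentation_metrics` is a hypothetical helper written for this card; Hausdorff distance is left out because it needs boundary extraction and a distance computation that are beyond a short example.

```python
import numpy as np


def segmentation_metrics(pred, target, eps=1e-7):
    """Dice, IoU, precision and recall for two binary masks of equal shape.

    `eps` avoids division by zero; with two empty masks every metric is 0.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    tp = np.logical_and(pred, target).sum()    # vessel pixels found
    fp = np.logical_and(pred, ~target).sum()   # background marked as vessel
    fn = np.logical_and(~pred, target).sum()   # vessel pixels missed

    return {
        "dice": float(2 * tp / (2 * tp + fp + fn + eps)),
        "iou": float(tp / (tp + fp + fn + eps)),
        "precision": float(tp / (tp + fp + eps)),
        "recall": float(tp / (tp + fn + eps)),
    }


# Toy example: one true positive, one false positive, one false negative.
pred = np.array([[1, 1, 0, 0]])
target = np.array([[0, 1, 1, 0]])
print(segmentation_metrics(pred, target))
```

In the toy example Dice, precision and recall are each about 0.5 and IoU is about 0.33.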

	### Performance Considerations

	- Point Prompt Quality: Model performance heavily depends on the quality and number of point prompts
	- Image Quality: Better results with high-contrast angiography images
	- Vessel Complexity: Performance may vary with vessel overlap and bifurcations
	- Model Selection: Larger models generally provide better accuracy but slower inference
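
To put numbers on the speed side of that trade-off, a simple wall-clock timer is usually enough. The sketch below is generic Python with no SAM2 dependency; `time_inference` and its parameters are hypothetical, and `run_once` stands for whatever callable wraps a single prediction.

```python
import statistics
import time


def time_inference(run_once, warmup=3, repeats=10):
    """Time a zero-argument callable and return summary statistics in seconds."""
    # Warm-up runs absorb one-off costs such as lazy initialization.
    for _ in range(warmup):
        run_once()

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)

    return {
        "mean_s": statistics.mean(timings),
        "median_s": statistics.median(timings),
        "min_s": min(timings),
    }


# Stand-in workload; replace with a call that runs one prediction.
stats = time_inference(lambda: sum(range(100_000)))
print(stats)
```

On a GPU, `run_once` should end with `torch.cuda.synchronize()` so that the timer measures completed work rather than kernel launches.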

	## How to Use

	### Installation

	```bash
	# Create conda environment
	conda create -n sam2_FT_env python=3.10.0 -y
	conda activate sam2_FT_env

	# Install SAM2
	git clone https://github.com/facebookresearch/segment-anything-2.git
	cd segment-anything-2
	pip install -e .
	cd ..

	# Install dependencies
	pip install gradio opencv-python-headless torch torchvision torchaudio
	```

	### Basic Usage

	```python
	import numpy as np
	import torch
	from sam2.build_sam import build_sam2
	from sam2.sam2_image_predictor import SAM2ImagePredictor

	# Load the fine-tuned checkpoint (a dict with the config name and the weights)
	checkpoint_path = "ft_models/sam2_s/checkpoint_epoch_70.pt"
	device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
	checkpoint = torch.load(checkpoint_path, map_location=device)

	# Build the architecture only; the weights are loaded from the checkpoint below
	model_cfg = checkpoint['model_cfg']
	sam2_model = build_sam2(model_cfg, ckpt_path=None, device=device)

	# Strip the "module." prefix left by DataParallel/DDP training, then load
	state_dict = checkpoint['model_state_dict']
	new_state_dict = {
	    (k[7:] if k.startswith('module.') else k): v
	    for k, v in state_dict.items()
	}
	sam2_model.load_state_dict(new_state_dict)
	sam2_model.eval()

	# Create predictor
	predictor = SAM2ImagePredictor(sam2_model)

	# Set image: `preprocessed_image` must be supplied by the caller as a
	# 1024x1024 RGB uint8 array that has been through the preprocessing pipeline
	predictor.set_image(preprocessed_image)

	# Point prompts as (x, y) pixel coordinates in the 1024x1024 image
	point_coords = np.array([[512, 300], [520, 310]])
	point_labels = np.array([1, 1])  # 1 = positive, 0 = negative

	# Predict; multimask_output=True returns several candidate masks with scores
	masks, scores, logits = predictor.predict(
	    point_coords=point_coords,
	    point_labels=point_labels,
	    multimask_output=True,
	)
	```
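
With `multimask_output=True`, `predictor.predict` returns several candidate masks together with one score per mask. One common way to reduce them to a single result is to keep the highest-scoring candidate, as in this sketch. `select_best_mask` is a hypothetical helper, and the arrays below are synthetic stand-ins for `masks` and `scores`.

```python
import numpy as np


def select_best_mask(masks, scores):
    """Pick the candidate with the highest score and return it as 0/255 uint8."""
    masks = np.asarray(masks)
    scores = np.asarray(scores)
    best = int(np.argmax(scores))
    # Thresholding at 0.5 works for both boolean and 0.0/1.0 float masks.
    binary = (masks[best] > 0.5).astype(np.uint8) * 255
    return binary, float(scores[best])


# Synthetic stand-ins: three 4x4 candidate masks and their scores.
masks = np.zeros((3, 4, 4), dtype=np.float32)
masks[1, :2, :] = 1.0
scores = np.array([0.20, 0.90, 0.50])

best_mask, best_score = select_best_mask(masks, scores)
print(best_mask.shape, best_mask.dtype, best_score)
```

Keeping all candidates available in an interactive tool can still be useful, since the top-scoring mask is not always the one a reviewer prefers.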

	### Interactive Application

	Launch the Gradio interface:

	```bash
	python app.py
	```

	Then open `http://127.0.0.1:7860` in a browser.

	## Limitations

	### Technical Limitations

	- Fixed Input Size: Models expect 1024×1024 input (automatic resizing may affect small vessels)
	- Memory Requirements: The Large variant needs substantial GPU memory (about 8 GB of VRAM recommended)
	- Point Dependency: Requires manual point prompts; not fully automatic
	- Single Modality: Optimized only for X-ray angiography
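
Because of the fixed input size, clicks made on the original image have to be mapped into the 1024×1024 frame before they are used as point prompts. The sketch below assumes a plain stretch resize with no padding or cropping, which the card does not state explicitly; `scale_points` is a hypothetical helper.

```python
import numpy as np


def scale_points(points_xy, original_hw, target_size=1024):
    """Map (x, y) points from the original image to the resized square image.

    ASSUMPTION: the image is stretched to target_size x target_size, so x and y
    are scaled independently by target_size / width and target_size / height.
    """
    height, width = original_hw
    points = np.asarray(points_xy, dtype=np.float32).reshape(-1, 2)
    scale = np.array([target_size / width, target_size / height], dtype=np.float32)
    return points * scale


# A click at (256, 128) on a 512x512 frame lands at (512, 256) after resizing.
print(scale_points([(256, 128)], original_hw=(512, 512)))
```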

	### Medical Limitations

	- Not FDA Approved: Not cleared for clinical diagnostic use
	- Requires Expert Review: All outputs must be validated by qualified professionals
	- Variability: Performance may vary across different imaging protocols and equipment
	- Edge Cases: May struggle with severe vessel overlap, calcifications, or poor image quality

	### Known Issues

	- High-contrast regions may cause over-segmentation
	- Thin vessel branches may be missed without precise point placement
	- Performance degradation on low-quality or motion-blurred images

	## Ethical Considerations

	### Medical AI Responsibility

	- Human Oversight Required: This tool is designed to assist, not replace, medical professionals
	- No Autonomous Decisions: Should never be used for automated clinical decisions
	- Training Data Bias: Model performance may reflect biases present in training data
	- Privacy: Ensure patient data is handled according to HIPAA/GDPR regulations

	### Fairness & Bias

	- Model performance across different patient demographics should be validated
	- Imaging equipment and protocols may affect performance
	- Consider potential biases in training dataset composition

	### Transparency

	- Model predictions should be explainable to medical professionals
	- Segmentation boundaries should be reviewable and editable
	- Point prompt influence on outputs should be clear to users

	## Citation

	### Base Model (SAM2)

	```bibtex
	@article{ravi2024sam2,
	title={SAM 2: Segment Anything in Images and Videos},
	author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and others},
	journal={arXiv preprint arXiv:2408.00714},
	year={2024}
	}
	```

	### This Work

	If you use CoronarySAM2 in your research, please cite:

	```bibtex
	@software{coronarysam2_2025,
	title={CoronarySAM2: Fine-tuned SAM2 for Coronary Artery Segmentation},
	author={[Your Name/Team]},
	year={2025},
	url={[Repository URL]}
	}
	```

	## Model Card Authors

	- [Primary Author Names]
	- Last Updated: November 2025

	## Contact

	For questions, issues, or collaboration inquiries:

	- GitHub Issues: [Repository URL]/issues
	- Email: [Contact Email]

	## Disclaimer

	⚠️ IMPORTANT MEDICAL DISCLAIMER ⚠️

	This software is provided for research and educational purposes only. It is not intended for clinical use, medical diagnosis, or treatment planning. The models have not been validated for clinical deployment and are not FDA-approved or CE-marked medical devices.

	Always consult qualified healthcare professionals for medical image interpretation and clinical decisions. The developers assume no liability for any clinical use or consequences resulting from the use of this software.

	## Additional Resources

	- [SAM2 Paper](https://arxiv.org/abs/2408.00714)
	- [SAM2 GitHub Repository](https://github.com/facebookresearch/segment-anything-2)
	- [Project README](README.md)
	- [Application Interface](app.py)

	---

	Version: 1.0
	Last Updated: November 18, 2025
	Status: Research/Development