---
license: apache-2.0
base_model:
- facebook/sam2-hiera-tiny
- facebook/sam2-hiera-small
- facebook/sam2-hiera-base-plus
- facebook/sam2-hiera-large
pipeline_tag: image-segmentation
---
# Model Card: CoronarySAM2 - Fine-tuned SAM2 for Coronary Artery Segmentation
## Model Details
### Model Description
CoronarySAM2 is a collection of fine-tuned Segment Anything Model 2 (SAM2) variants specifically optimized for coronary artery segmentation in X-ray angiography images. The models use point-based prompting to enable interactive and precise segmentation of coronary arteries from medical imaging data.
- **Developed by:** Research Team
- **Model Type:** Computer Vision - Image Segmentation
- **Base Architecture:** SAM2 (Segment Anything Model 2) with Hiera backbone
- **Language(s):** Python
- **License:** Apache 2.0
- **Fine-tuned from:** [facebook/segment-anything-2](https://github.com/facebookresearch/segment-anything-2)
### Model Variants
Four model variants are available, offering different trade-offs between speed and accuracy:
| Model | Parameters | Checkpoint | Speed | Accuracy | Use Case |
|-------|-----------|------------|-------|----------|----------|
| **SAM2 Hiera Tiny** | ~38M | `sam2_t/best_model.pt` | ⚡⚡⚡ Fast | ⭐⭐⭐ Good | Quick experiments, real-time feedback |
| **SAM2 Hiera Small** | ~46M | `sam2_s/checkpoint_epoch_70.pt` | ⚡⚡ Medium | ⭐⭐⭐⭐ Very Good | Balanced performance, general use |
| **SAM2 Hiera Base Plus** | ~80M | `sam2_b+/best_model.pt` | ⚡ Slower | ⭐⭐⭐⭐⭐ Excellent | High-quality results, clinical evaluation |
| **SAM2 Hiera Large** | ~224M | `sam2_l/final_model.pt` | ⚡ Slowest | ⭐⭐⭐⭐⭐ Best | Maximum accuracy, research purposes |
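The variant-to-checkpoint mapping in the table can be captured in a small lookup helper. This is only a sketch: the variant keys (`tiny`, `small`, `base_plus`, `large`) and the function name are hypothetical, and the `ft_models` root directory is assumed from the path used in the Basic Usage example.

```python
from pathlib import Path

# Relative checkpoint paths copied from the Model Variants table.
# The dictionary keys are hypothetical labels chosen for this sketch.
CHECKPOINTS = {
    "tiny": "sam2_t/best_model.pt",
    "small": "sam2_s/checkpoint_epoch_70.pt",
    "base_plus": "sam2_b+/best_model.pt",
    "large": "sam2_l/final_model.pt",
}


def checkpoint_path(variant: str, root: str = "ft_models") -> Path:
    """Return the checkpoint path for a variant under an assumed root directory."""
    try:
        relative = CHECKPOINTS[variant]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"Unknown variant {variant!r}; expected one of: {known}")
    return Path(root) / relative
```

For example, `checkpoint_path("small")` yields the same path that the Basic Usage example hard-codes.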
### Model Architecture
The models follow the SAM2 architecture with the following components:
1. **Image Encoder**: Hiera hierarchical vision transformer backbone
2. **Prompt Encoder**: Encodes point prompts (positive/negative) as spatial embeddings
3. **Mask Decoder**: Transformer-based decoder that generates high-quality segmentation masks
4. **Preprocessing Pipeline**:
   - X-ray image normalization using Gaussian blur
   - CLAHE (Contrast Limited Adaptive Histogram Equalization) for vessel enhancement
   - Fixed resolution resizing to 1024×1024 pixels
## Intended Use
### Primary Use Cases
- **Interactive Coronary Artery Segmentation**: Point-based annotation for precise artery delineation
- **Medical Image Analysis**: Prompt-driven, semi-automated assistance for cardiologists and radiologists
- **Research**: Computer-aided diagnosis and treatment planning research
- **Educational Purposes**: Training and demonstration of medical image segmentation
### Out-of-Scope Use
- ❌ Clinical diagnosis without expert oversight
- ❌ Automated treatment decisions
- ❌ Real-time interventional guidance without validation
- ❌ Non-coronary vessel segmentation (not trained for this task)
- ❌ Modalities other than X-ray angiography (CT, MRI, etc.)
## Training Data
### Dataset
The models were fine-tuned on coronary X-ray angiography images with annotations for coronary artery structures.
**Training Specifications:**
- **Modality**: X-ray Angiography
- **Target**: Coronary Arteries
- **Annotation Type**: Binary segmentation masks
- **Resolution**: Images resized to 1024×1024 for training
### Preprocessing
All training images underwent the following preprocessing pipeline:
1. **Normalization**: Gaussian blur-based intensity normalization
2. **CLAHE Enhancement**: Adaptive histogram equalization (clip limit: 2.0, tile grid: 8×8)
3. **Resizing**: Fixed 1024×1024 resolution
4. **Format**: RGB format (grayscale images converted to RGB)
## Evaluation
### Metrics
The models should be evaluated using the following metrics:
- **Dice Coefficient**: Measures overlap between predicted and ground truth masks
- **IoU (Intersection over Union)**: Ratio of the intersection to the union of predicted and ground truth masks
- **Precision & Recall**: For detecting true vessel pixels
- **Hausdorff Distance**: Measures boundary accuracy
- **Inference Time**: Speed benchmarks on various hardware
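As a minimal sketch of the overlap metrics listed above, the function below computes Dice, IoU, precision, and recall for one pair of binary masks using NumPy. The function name and the small `eps` term (used to avoid division by zero) are choices made for this example. Hausdorff distance and timing are omitted.

```python
import numpy as np


def segmentation_metrics(pred, target, eps: float = 1e-7) -> dict:
    """Compute Dice, IoU, precision, and recall for two binary masks."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)

    tp = np.logical_and(pred, target).sum()   # vessel pixels correctly predicted
    fp = np.logical_and(pred, ~target).sum()  # background predicted as vessel
    fn = np.logical_and(~pred, target).sum()  # vessel pixels that were missed

    return {
        "dice": float(2 * tp / (2 * tp + fp + fn + eps)),
        "iou": float(tp / (tp + fp + fn + eps)),
        "precision": float(tp / (tp + fp + eps)),
        "recall": float(tp / (tp + fn + eps)),
    }
```

Note that with this `eps` convention two empty masks score 0 on every metric. A different convention may be preferable if empty ground truth masks occur in the evaluation set.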
### Performance Considerations
- **Point Prompt Quality**: Model performance heavily depends on the quality and number of point prompts
- **Image Quality**: Better results with high-contrast angiography images
- **Vessel Complexity**: Performance may vary with vessel overlap and bifurcations
- **Model Selection**: Larger models generally provide better accuracy but slower inference
## How to Use
### Installation
```bash
# Create conda environment
conda create -n sam2_FT_env python=3.10.0 -y
conda activate sam2_FT_env
# Install SAM2
git clone https://github.com/facebookresearch/segment-anything-2.git
cd segment-anything-2
pip install -e .
cd ..
# Install dependencies
pip install gradio opencv-python-headless torch torchvision torchaudio
```
### Basic Usage
```python
import torch
import numpy as np
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor
# Load model
checkpoint_path = "ft_models/sam2_s/checkpoint_epoch_70.pt"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
checkpoint = torch.load(checkpoint_path, map_location=device)
model_cfg = checkpoint['model_cfg']
sam2_model = build_sam2(model_cfg, checkpoint_path=None, device=device)
# Load state dict
state_dict = checkpoint['model_state_dict']
new_state_dict = {k[7:] if k.startswith('module.') else k: v
                  for k, v in state_dict.items()}
sam2_model.load_state_dict(new_state_dict)
sam2_model.eval()
# Create predictor
predictor = SAM2ImagePredictor(sam2_model)
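# Set image: `preprocessed_image` is assumed to be an (H, W, 3) uint8 RGB NumPy array
# that has already gone through the preprocessing pipeline (1024x1024)
predictor.set_image(preprocessed_image)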
# Add point prompts
point_coords = np.array([[512, 300], [520, 310]]) # x, y coordinates
point_labels = np.array([1, 1]) # 1 = positive, 0 = negative
# Predict
masks, scores, logits = predictor.predict(
    point_coords=point_coords,
    point_labels=point_labels,
    multimask_output=True
)
```
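With `multimask_output=True` the predictor returns several candidate masks, each with a quality score. A common follow-up step, sketched here with a hypothetical helper name, is to keep the highest-scoring candidate. The sketch assumes `masks` has shape `(C, H, W)` and `scores` has shape `(C,)`, as returned by `predictor.predict` above.

```python
import numpy as np


def select_best_mask(masks, scores):
    """Return the highest-scoring candidate mask as a boolean array, plus its score."""
    masks = np.asarray(masks)
    scores = np.asarray(scores)
    best = int(np.argmax(scores))  # index of the candidate with the top score
    return masks[best].astype(bool), float(scores[best])
```

Usage after the prediction call would be `best_mask, best_score = select_best_mask(masks, scores)`.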
### Interactive Application
Launch the Gradio interface:
```bash
python app.py
```
Access at `http://127.0.0.1:7860`
## Limitations
### Technical Limitations
- **Fixed Input Size**: Models expect 1024×1024 input (automatic resizing may affect small vessels)
- **Memory Requirements**: Large model requires significant GPU memory (~8GB VRAM recommended)
- **Point Dependency**: Requires manual point prompts; not fully automatic
- **Single Modality**: Optimized only for X-ray angiography
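Because the image is resized to a fixed 1024×1024 grid, point prompts chosen on the original image have to be mapped into the resized coordinate frame before prediction. The helper below is a sketch under the assumption that resizing is a plain stretch to 1024×1024 without padding or aspect-ratio preservation; the function name is hypothetical.

```python
import numpy as np


def scale_points(points_xy, original_size, target_size: int = 1024) -> np.ndarray:
    """Map (x, y) points from an image of original_size=(height, width)
    to the coordinate frame of a target_size x target_size resized image."""
    height, width = original_size
    points = np.asarray(points_xy, dtype=np.float32).reshape(-1, 2)
    # x is scaled by the width ratio, y by the height ratio.
    scale = np.array([target_size / width, target_size / height], dtype=np.float32)
    return points * scale
```

For instance, a click at the center of a 400×300 image maps to the center of the 1024×1024 input.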
### Medical Limitations
- **Not FDA Approved**: Not cleared for clinical diagnostic use
- **Requires Expert Review**: All outputs must be validated by qualified professionals
- **Variability**: Performance may vary across different imaging protocols and equipment
- **Edge Cases**: May struggle with severe vessel overlap, calcifications, or poor image quality
### Known Issues
- High-contrast regions may cause over-segmentation
- Thin vessel branches may be missed without precise point placement
- Performance degradation on low-quality or motion-blurred images
## Ethical Considerations
### Medical AI Responsibility
- **Human Oversight Required**: This tool is designed to assist, not replace, medical professionals
- **No Autonomous Decisions**: Should never be used for automated clinical decisions
- **Training Data Bias**: Model performance may reflect biases present in training data
- **Privacy**: Ensure patient data is handled according to HIPAA/GDPR regulations
### Fairness & Bias
- Model performance across different patient demographics should be validated
- Imaging equipment and protocols may affect performance
- Consider potential biases in training dataset composition
### Transparency
- Model predictions should be explainable to medical professionals
- Segmentation boundaries should be reviewable and editable
- Point prompt influence on outputs should be clear to users
## Citation
### Base Model (SAM2)
```bibtex
@article{ravi2024sam2,
title={SAM 2: Segment Anything in Images and Videos},
author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and others},
journal={arXiv preprint arXiv:2408.00714},
year={2024}
}
```
### This Work
If you use CoronarySAM2 in your research, please cite:
```bibtex
@software{coronarysam2_2025,
title={CoronarySAM2: Fine-tuned SAM2 for Coronary Artery Segmentation},
author={[Your Name/Team]},
year={2025},
url={[Repository URL]}
}
```
## Model Card Authors
- [Primary Author Names]
- Last Updated: November 2025
## Contact
For questions, issues, or collaboration inquiries:
- **GitHub Issues**: [Repository URL]/issues
- **Email**: [Contact Email]
## Disclaimer
**⚠️ IMPORTANT MEDICAL DISCLAIMER ⚠️**
This software is provided for **research and educational purposes only**. It is not intended for clinical use, medical diagnosis, or treatment planning. The models have not been validated for clinical deployment and are not FDA-approved or CE-marked medical devices.
**Always consult qualified healthcare professionals** for medical image interpretation and clinical decisions. The developers assume no liability for any clinical use or consequences resulting from the use of this software.
## Additional Resources
- [SAM2 Paper](https://arxiv.org/abs/2408.00714)
- [SAM2 GitHub Repository](https://github.com/facebookresearch/segment-anything-2)
- [Project README](README.md)
- [Application Interface](app.py)
---
**Version**: 1.0
**Last Updated**: November 18, 2025
**Status**: Research/Development