---
datasets:
- EchoDynamic
- RVENet
- EchoNet-Pediatric-LVH
language: en
library_name: pytorch
license: mit
tags:
- self-supervised-learning
- echocardiography
- medical-imaging
- video-representation
paper: https://arxiv.org/pdf/2506.11777
pipeline_tag: video-feature-extraction
---
# DISCOVR: Self-Supervised Echocardiography Representations
**Paper:** *Self-Supervised Learning of Echocardiographic Video Representations via Online Cluster Distillation* (NeurIPS 2025)
[arXiv:2506.11777](https://arxiv.org/pdf/2506.11777)
**Code:** [https://github.com/mdivyanshu97/DISCOVR](https://github.com/mdivyanshu97/DISCOVR)
---
## Available Checkpoints
| Epochs | Filename | Description |
|:-------:|:-----------|:-------------|
| 200 | `checkpoint-199.pth` | Model trained for ~200 epochs |
| 300 | `checkpoint-299.pth` | Model trained for ~300 epochs |
| 400 | `checkpoint-399.pth` | Model trained for ~400 epochs |
| 600 | `checkpoint-599.pth` | Model trained for ~600 epochs |
| 800 | `checkpoint-799.pth` | Model trained for ~800 epochs |
> Each checkpoint corresponds to a model trained for the indicated number of epochs on **adult and pediatric echocardiography datasets** (EchoDynamic, RVENet, EchoNet-Pediatric LVH).
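A downloaded checkpoint can be inspected with plain PyTorch before wiring it into the training code. The sketch below is illustrative only: the `"model"` / `"state_dict"` key names and the dummy tensor written for demonstration are assumptions, so inspect your actual `.pth` file to confirm its layout.

```python
import torch

# For illustration, write a tiny dummy checkpoint in an assumed layout
# (a dict with a "model" key holding the weight tensors).
dummy = {"model": {"patch_embed.proj.weight": torch.zeros(768, 3, 16, 16)}}
torch.save(dummy, "checkpoint-demo.pth")

# Load on CPU and unwrap the weights. Real DISCOVR checkpoints may use
# "model", "state_dict", or store the tensors at the top level.
ckpt = torch.load("checkpoint-demo.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt.get("state_dict", ckpt))
print(len(state_dict), "weight tensor(s) found")
```

Printing the keys of `state_dict` is a quick way to check which backbone variant a checkpoint corresponds to before loading it into a model.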
---
## Model Overview
DISCOVR is a self-supervised framework for learning spatio-temporal echocardiographic video representations via **online cluster distillation**.
It learns both fine-grained anatomical semantics and global temporal dynamics, supporting downstream tasks such as:
- Cardiac view classification
- Functional abnormality detection
- Video segmentation
- Representation learning for medical imaging
**Not for clinical or diagnostic use.**
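For the downstream tasks above, a common evaluation recipe is a linear probe on frozen embeddings. A minimal sketch, assuming a 768-dimensional embedding (ViT-Base); the number of view classes and the random stand-in features are purely illustrative:

```python
import torch
import torch.nn as nn

# Hypothetical linear probe for cardiac view classification on frozen
# DISCOVR embeddings. 768 matches a ViT-Base backbone; 4 views is an
# illustrative choice.
embed_dim, num_views = 768, 4
probe = nn.Linear(embed_dim, num_views)

# Stand-in for embeddings produced by the frozen encoder
# (batch of 8 video clips).
features = torch.randn(8, embed_dim)
logits = probe(features)
print(logits.shape)  # torch.Size([8, 4])
```

In practice the features would come from the pretrained encoder with gradients disabled, and only the probe's parameters would be trained.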
---
## Sample Usage
To pretrain the model on echocardiographic videos:
```bash
python -m torch.distributed.launch --nproc_per_node=NUM_GPUS \
scripts/run_mae_pretraining.py \
--data_path /path/to/echo_videos \
--data_path_csv /path/to/train.csv \
--data_path_val /path/to/val.csv \
--data_path_test /path/to/test.csv \
--mask_type multi_local \
--loss_func SIGMA \
--model pretrain_videomae_base_patch16_224 \
--batch_size 48 \
--num_frames 64 \
--opt adamw \
--opt_betas 0.9 0.95 \
--warmup_epochs 40 \
--epochs 400
```
---
## Quick Facts
- **Repo:** `Div97/DISCOVR_ADULT_PEDIATRIC_MODEL`
- **Model family:** DISCOVR checkpoints (199–799)
- **Architecture:** ViT-Base backbone, 64-frame clips (stride 3)
- **Datasets used:** EchoDynamic, RVENet, EchoNet-Pediatric LVH
- **Training objective:** Self-supervised online cluster distillation
- **Intended use:** Research & education
- **Not intended for:** Clinical decision-making or real-world patient care
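The 64-frame, stride-3 clip setting above can be sketched as a simple index computation. This is an approximation for illustration (the clamping behavior for short videos is an assumption); the repository's data loader is authoritative.

```python
import numpy as np

def sample_clip_indices(num_video_frames, num_frames=64, stride=3, start=0):
    """Frame indices for one clip: num_frames samples, `stride` frames apart."""
    idx = start + np.arange(num_frames) * stride
    # Clamp to the last frame so short videos repeat their final frame
    # (assumed behavior -- check the official loader).
    return np.minimum(idx, num_video_frames - 1)

# A 300-frame video covers indices 0, 3, 6, ..., 189 in one clip.
indices = sample_clip_indices(300)
print(indices[0], indices[-1], len(indices))
```

At stride 3, a 64-frame clip spans 190 raw frames, which is worth keeping in mind when filtering short echo cines.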
---
## Citation
If you use DISCOVR in your work, please cite:
```bibtex
@article{mishra2025self,
title={Self-supervised Learning of Echocardiographic Video Representations via Online Cluster Distillation},
author={Mishra, Divyanshu and Salehi, Mohammadreza and Saha, Pramit and Patey, Olga and Papageorghiou, Aris T and Asano, Yuki M and Noble, J Alison},
journal={arXiv preprint arXiv:2506.11777},
year={2025}
}
```
---
## License
This project is licensed under the MIT License. |