Add model card and links to paper/code
#1 opened by nielsr (HF Staff)
README.md CHANGED
|
@@ -1,7 +1,9 @@
|
|
| 1 |
---
|
| 2 |
-
license: mit
|
| 3 |
datasets:
|
| 4 |
- antonioscardace/deepssim
|
| 5 |
pipeline_tag: image-feature-extraction
|
| 6 |
tags:
|
| 7 |
- memorization
|
|
@@ -10,6 +12,46 @@ tags:
|
|
| 10 |
- medical-imaging
|
| 11 |
- brain-mri
|
| 12 |
- chest-xray
|
| 13 |
-
|
| 14 |
-
|
| 15 |
-
|
| 1 |
---
|
|
|
|
| 2 |
datasets:
|
| 3 |
- antonioscardace/deepssim
|
| 4 |
+
language:
|
| 5 |
+
- en
|
| 6 |
+
license: mit
|
| 7 |
pipeline_tag: image-feature-extraction
|
| 8 |
tags:
|
| 9 |
- memorization
|
|
|
|
| 12 |
- medical-imaging
|
| 13 |
- brain-mri
|
| 14 |
- chest-xray
|
| 15 |
+
arxiv: 2509.16582
|
| 16 |
+
---
|
| 17 |
+
|
| 18 |
+
# DeepSSIM
|
| 19 |
+
|
| 20 |
+
DeepSSIM is a novel self-supervised metric for quantifying and detecting training data memorization in generative models, specifically designed for medical imaging such as brain MRI and chest X-ray synthesis.
|
| 21 |
+
|
| 22 |
+
It works by projecting images into a learned embedding space in which the cosine similarity between two embeddings is trained to match the ground-truth SSIM (Structural Similarity Index) score of the corresponding image pair. This allows DeepSSIM to identify data leakage and memorized content even when the images are not precisely spatially aligned.
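
As a rough illustration of the comparison described above, the sketch below computes pairwise cosine similarities between two sets of embedding vectors with NumPy. It is not taken from the DeepSSIM repository: the function name `cosine_similarity_matrix` and the toy vectors are assumptions, and real embeddings would come from the trained model.

```python
import numpy as np


def cosine_similarity_matrix(a, b, eps=1e-12):
    """Return the (len(a), len(b)) matrix of cosine similarities between rows.

    Hypothetical helper for illustration only; `a` and `b` stand in for
    embeddings of two image sets, one embedding per row.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Normalize each row to unit length; eps guards against zero vectors.
    a_unit = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), eps)
    b_unit = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), eps)
    # The dot product of unit vectors is their cosine similarity.
    return a_unit @ b_unit.T


# Toy stand-ins for model embeddings (assumed values, not real outputs).
synthetic = np.array([[1.0, 0.0], [0.0, 2.0]])
training = np.array([[3.0, 0.0], [1.0, 1.0]])
print(cosine_similarity_matrix(synthetic, training))
```

Under the model's training objective, each entry of such a matrix would be read as an estimate of the SSIM between the corresponding image pair.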
|
| 23 |
+
|
| 24 |
+
- **Paper:** [A Novel Metric for Detecting Memorization in Generative Models for Brain MRI Synthesis](https://huggingface.co/papers/2509.16582)
|
| 25 |
+
- **GitHub Repository:** [brAIn-science/DeepSSIM](https://github.com/brAIn-science/DeepSSIM)
|
| 26 |
+
|
| 27 |
+
## Overview
|
| 28 |
+
|
| 29 |
+
DeepSSIM addresses the risk of unauthorized patient information disclosure by detecting when deep generative models (such as latent diffusion models) memorize sensitive training data. In brain MRI synthesis case studies, DeepSSIM outperformed state-of-the-art memorization metrics, improving F1 scores by an average of +52.03% over the best existing method.
|
| 30 |
+
|
| 31 |
+
## Usage
|
| 32 |
+
|
| 33 |
+
Since the model produces an embedding for each image, the corresponding similarity matrix must first be computed to identify memorization. According to the official repository, you can use the following script to compute the matrix:
|
| 34 |
+
|
| 35 |
+
```console
|
| 36 |
+
python scripts/compute_matrix.py \
|
| 37 |
+
--dataset_images_dir PATH \
|
| 38 |
+
--embeddings_dir PATH \
|
| 39 |
+
--matrices_dir PATH \
|
| 40 |
+
--indices_dir PATH \
|
| 41 |
+
--model_path PATH \
|
| 42 |
+
--metric_name {deepssim, chen, dar, semdedup} \
|
| 43 |
+
--use_gpu
|
| 44 |
+
```
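
Once a similarity matrix is available, one plausible way to read it is to look up, for each synthetic image, its most similar training image and flag pairs above a cutoff. The sketch below assumes a matrix of shape (synthetic images, training images); the function `flag_memorized` and the 0.9 threshold are hypothetical and not taken from the repository, whose evaluation scripts may work differently.

```python
import numpy as np


def flag_memorized(similarity, threshold):
    """Find each synthetic image's nearest training image and flag likely copies.

    `similarity` is assumed to have shape (n_synthetic, n_training).
    Returns the nearest training index, its score, and a boolean flag per row.
    """
    similarity = np.asarray(similarity, dtype=np.float64)
    nearest_idx = similarity.argmax(axis=1)    # most similar training image
    nearest_score = similarity.max(axis=1)     # its similarity score
    flagged = nearest_score >= threshold       # candidate memorized samples
    return nearest_idx, nearest_score, flagged


# Toy matrix with assumed values: 2 synthetic images vs. 3 training images.
sim = np.array([[0.10, 0.95, 0.30],
                [0.20, 0.40, 0.50]])
idx, score, flagged = flag_memorized(sim, threshold=0.9)
for i in range(len(idx)):
    print(f"synthetic {i}: nearest training {idx[i]}, "
          f"score {score[i]:.2f}, flagged {flagged[i]}")
```

In practice the cutoff would need to be chosen on held-out data rather than fixed in advance.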
|
| 45 |
+
|
| 46 |
+
For installation and further evaluation steps, please refer to the [GitHub repository](https://github.com/brAIn-science/DeepSSIM).
|
| 47 |
+
|
| 48 |
+
## Citation
|
| 49 |
+
|
| 50 |
+
```bibtex
|
| 51 |
+
@article{scardace2025novel,
|
| 52 |
+
title={A Novel Metric for Detecting Memorization in Generative Models for Brain MRI Synthesis},
|
| 53 |
+
author={Scardace, Antonio and Puglisi, Lemuel and Guarnera, Francesco and Battiato, Sebastiano and Rav{\`\i}, Daniele},
|
| 54 |
+
journal={arXiv preprint arXiv:2509.16582},
|
| 55 |
+
year={2025}
|
| 56 |
+
}
|
| 57 |
+
```
|