antonioscardace/deepssim

Image Feature Extraction

generative-models

diffusion-models

medical-imaging

Add model card and links to paper/code

#1

by nielsr HF Staff - opened Dec 30, 2025

base: refs/heads/main ← from: refs/pr/1

README.md +46 -4

@@ -1,7 +1,9 @@
 ---
-license: mit
 datasets:
 - antonioscardace/deepssim
+language:
+- en
+license: mit
 pipeline_tag: image-feature-extraction
 tags:
 - memorization
@@ -10,6 +12,46 @@ tags:
 - medical-imaging
 - brain-mri
 - chest-xray
-language:
-- en
----
+arxiv: 2509.16582
+---
+# DeepSSIM
+DeepSSIM is a novel self-supervised metric for quantifying and detecting training-data memorization in generative models. It is designed for medical image synthesis, such as brain MRI and chest X-ray generation.
+The model is trained to project images into an embedding space in which the cosine similarity between two embeddings matches the ground-truth SSIM (Structural Similarity Index) score of the corresponding images. This allows DeepSSIM to reliably identify data leakage and memorized content even without precise spatial alignment.
+- **Paper:** [A Novel Metric for Detecting Memorization in Generative Models for Brain MRI Synthesis](https://huggingface.co/papers/2509.16582)
+- **GitHub Repository:** [brAIn-science/DeepSSIM](https://github.com/brAIn-science/DeepSSIM)
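
The relationship described above can be written out explicitly. The first line is the standard SSIM definition for two images $x$ and $y$ (local means $\mu$, variances $\sigma^2$, covariance $\sigma_{xy}$, small stabilizing constants $C_1, C_2$). The second line states the property the embedding space is meant to satisfy. The symbol $f_\theta$ for the embedding network is notation introduced here for illustration and is not taken from the paper.

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x \mu_y + C_1)\,(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)}

\cos\!\big(f_\theta(x), f_\theta(y)\big) =
  \frac{f_\theta(x)^\top f_\theta(y)}
       {\lVert f_\theta(x) \rVert \, \lVert f_\theta(y) \rVert}
  \approx \mathrm{SSIM}(x, y)
```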
+## Overview
+DeepSSIM addresses the risk of unauthorized patient information disclosure by detecting when deep generative models, such as Latent Diffusion Models, memorize sensitive training data. In evaluations, DeepSSIM outperformed state-of-the-art memorization metrics, improving F1 scores by an average of 52.03% over the best existing method in brain MRI synthesis case studies.
+## Usage
+Because the model produces one embedding per image, memorization is identified from the similarity matrix between embeddings, which must be computed first. According to the official repository, the following script computes the matrix:
+```console
+python scripts/compute_matrix.py \
+  --dataset_images_dir PATH \
+  --embeddings_dir PATH \
+  --matrices_dir PATH \
+  --indices_dir PATH \
+  --model_path PATH \
+  --metric_name {deepssim, chen, dar, semdedup} \
+  --use_gpu
+```
+For installation and further evaluation steps, please refer to the [GitHub repository](https://github.com/brAIn-science/DeepSSIM).
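
To illustrate what a similarity matrix over embeddings looks like, here is a minimal NumPy sketch. It is not the repository's `compute_matrix.py`. It assumes the embeddings are already available as 2-D arrays with one row per image, and the function names and the `0.9` threshold are hypothetical choices made only for this example.

```python
import numpy as np


def cosine_similarity_matrix(synthetic, training):
    """Return an (n_synthetic, n_training) matrix of cosine similarities."""
    eps = 1e-12  # guards against division by zero for all-zero embeddings
    s = synthetic / np.maximum(np.linalg.norm(synthetic, axis=1, keepdims=True), eps)
    t = training / np.maximum(np.linalg.norm(training, axis=1, keepdims=True), eps)
    # After L2 normalization, the dot product equals the cosine similarity.
    return s @ t.T


def flag_memorized(matrix, threshold=0.9):
    """For each synthetic image, find its most similar training image.

    The threshold is a hypothetical value; a real one would need to be
    calibrated on labeled data.
    """
    nearest = matrix.argmax(axis=1)   # index of the closest training image
    scores = matrix.max(axis=1)       # similarity to that image
    return nearest, scores, scores >= threshold


# Toy 2-D embeddings standing in for real model outputs.
training = np.array([[1.0, 0.0], [0.0, 1.0]])
synthetic = np.array([[2.0, 0.0], [1.0, 1.0]])

matrix = cosine_similarity_matrix(synthetic, training)
nearest, scores, flagged = flag_memorized(matrix, threshold=0.9)
print(matrix)
print(nearest, scores, flagged)
```

In this toy case the first synthetic embedding points in the same direction as the first training embedding, so it is flagged. The second lies halfway between the two training embeddings and falls below the threshold.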
+## Citation
+```bibtex
+@article{scardace2025novel,
+    title={A Novel Metric for Detecting Memorization in Generative Models for Brain MRI Synthesis},
+    author={Scardace, Antonio and Puglisi, Lemuel and Guarnera, Francesco and Battiato, Sebastiano and Rav{\`\i}, Daniele},
+    journal={arXiv preprint arXiv:2509.16582},
+    year={2025}
+}
+```