Add model card and links to paper/code

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +46 -4
README.md CHANGED
@@ -1,7 +1,9 @@
 ---
-license: mit
 datasets:
 - antonioscardace/deepssim
+language:
+- en
+license: mit
 pipeline_tag: image-feature-extraction
 tags:
 - memorization
@@ -10,6 +12,46 @@ tags:
 - medical-imaging
 - brain-mri
 - chest-xray
-language:
-- en
----
+arxiv: 2509.16582
+---
+
+# DeepSSIM
+
+DeepSSIM is a novel self-supervised metric for quantifying and detecting training data memorization in generative models, specifically designed for medical imaging such as brain MRI and chest X-ray synthesis.
+
+It works by projecting images into a learned embedding space where the cosine similarity between embeddings matches the ground-truth SSIM (Structural Similarity Index) scores. This allows DeepSSIM to reliably identify data leakage and memorized content even without precise spatial alignment.
+
+- **Paper:** [A Novel Metric for Detecting Memorization in Generative Models for Brain MRI Synthesis](https://huggingface.co/papers/2509.16582)
+- **GitHub Repository:** [brAIn-science/DeepSSIM](https://github.com/brAIn-science/DeepSSIM)
+
+## Overview
+
+DeepSSIM addresses the risk of unauthorized patient information disclosure by detecting when deep generative models (like Latent Diffusion Models) memorize sensitive training data. In evaluations, DeepSSIM achieved superior performance compared to state-of-the-art memorization metrics, improving F1 scores by an average of +52.03% over the best existing method in brain MRI synthesis case studies.
+
+## Usage
+
+Since the model produces an embedding for each image, the corresponding similarity matrix must first be computed to identify memorization. According to the official repository, you can use the following script to compute the matrix:
+
+```console
+python scripts/compute_matrix.py \
+--dataset_images_dir PATH \
+--embeddings_dir PATH \
+--matrices_dir PATH \
+--indices_dir PATH \
+--model_path PATH \
+--metric_name {deepssim, chen, dar, semdedup} \
+--use_gpu
+```
+
+For installation and further evaluation steps, please refer to the [GitHub repository](https://github.com/brAIn-science/DeepSSIM).
+
+## Citation
+
+```bibtex
+@article{scardace2025novel,
+title={A Novel Metric for Detecting Memorization in Generative Models for Brain MRI Synthesis},
+author={Scardace, Antonio and Puglisi, Lemuel and Guarnera, Francesco and Battiato, Sebastiano and Rav{\`\i}, Daniele},
+journal={arXiv preprint arXiv:2509.16582},
+year={2025}
+}
+```
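
The Usage section of the proposed README says the model produces one embedding per image and that a similarity matrix is then computed over those embeddings. A minimal NumPy sketch of that step follows, under stated assumptions: the embeddings are already available as 2-D arrays with one row per image, `cosine_similarity_matrix` and `flag_candidates` are hypothetical helper names, and the 0.95 threshold is purely illustrative. It does not load the DeepSSIM model and does not reproduce `scripts/compute_matrix.py`, whose internals are not shown in this diff.

```python
import numpy as np


def cosine_similarity_matrix(train_emb: np.ndarray, synth_emb: np.ndarray) -> np.ndarray:
    """Return an (n_train, n_synth) matrix of cosine similarities.

    Assumption: each input is a 2-D array with one non-zero embedding per row.
    """
    # L2-normalize every row so a plain dot product equals the cosine similarity.
    train_unit = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    synth_unit = synth_emb / np.linalg.norm(synth_emb, axis=1, keepdims=True)
    return train_unit @ synth_unit.T


def flag_candidates(sim: np.ndarray, threshold: float) -> list[tuple[int, int, float]]:
    """List (synthetic_index, nearest_training_index, similarity) above a threshold.

    The threshold is a hypothetical tuning knob, not a value from the paper.
    """
    nearest = sim.argmax(axis=0)  # closest training image for each synthetic image
    best = sim.max(axis=0)        # similarity to that closest training image
    return [
        (int(j), int(nearest[j]), float(best[j]))
        for j in range(sim.shape[1])
        if best[j] >= threshold
    ]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.normal(size=(5, 16))  # stand-in for training-image embeddings
    synth = rng.normal(size=(3, 16))  # stand-in for synthetic-image embeddings
    synth[1] = 2.0 * train[3]         # a scaled copy has cosine similarity 1
    sim = cosine_similarity_matrix(train, synth)
    print(flag_candidates(sim, threshold=0.95))
```

Taking the maximum over training images for each synthetic image is one simple way to read such a matrix. The repository's evaluation scripts may aggregate it differently.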