---
license: apache-2.0
tags:
- diffusion
- text-to-image
- safety
- dose-response
base_model: Photoroom/PRX
datasets:
- lehduong/flux_generated
- LucasFang/FLUX-Reason-6M
- brivangl/midjourney-v6-llava
pipeline_tag: text-to-image
---

# Dose-Response C0: 0% unsafe, full scale

This model is part of a **dose-response experiment** studying how the fraction of unsafe content in the training data affects the safety of images generated by text-to-image diffusion models.

## Model Details

| | |
|---|---|
| **Architecture** | PRX-1.2B (Photoroom diffusion model) |
| **Parameters** | 1.2B (denoiser only) |
| **Resolution** | 512px |
| **Condition** | C0 — 0% unsafe, full scale |
| **Unsafe fraction** | 0% |
| **Training set size** | ~7.85M images |
| **Training steps** | 100K batches |
| **Batch size** | 1024 (global) |
| **Precision** | bf16 |
| **Hardware** | 8x H200 GPUs |

## Condition Description

All unsafe images are removed; training uses only the safe pool (~7.85M safe images).

## Dose-Response Conditions Overview

This model is one of 7 conditions in the dose-response experiment:

| Condition | Unsafe Fraction | Dataset Scale | Description |
|-----------|----------------|---------------|-------------|
| **C0** | 0% | Full (~7.85M) | All unsafe removed |
| **C1** | 5% | Full (~8.24M) | Unsafe oversampled to 5% |
| **C2** | 10% | Full (~8.72M) | Unsafe oversampled to 10% |
| **C3** | ~1.21% | Full (~7.94M) | Original composition |
| **C4** | ~1.21% | 1M | Original proportion, downscaled |
| **C5** | ~9.6% | 1M | All unsafe included, downscaled |
| **C6** | ~1.21% | 100K | Original proportion, small scale |

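The fractions above compose consistently under a simple reading: the original pool holds ~7.85M safe images plus ~96K unsafe ones (≈1.21% of ~7.94M). A quick sanity check of that reading (a sketch, not part of the release):

```python
# Sanity check of the condition fractions, derived from the table above.
safe_pool = 7.85e6                 # safe images (the C0 training set)
unsafe_pool = 0.0121 * 7.94e6      # ~96K unsafe images in the original pool (C3)

# C5: every unsafe image kept, set downscaled to 1M images total
c5_unsafe_fraction = unsafe_pool / 1e6
print(f"C5 unsafe fraction: {c5_unsafe_fraction:.1%}")   # 9.6%

# C2: unsafe oversampled to 10% of the total, on top of the full safe pool
c2_total = safe_pool / (1 - 0.10)
print(f"C2 total size: {c2_total / 1e6:.2f}M")           # 8.72M
```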
## Training Details

- **Base architecture**: [PRX](https://github.com/Photoroom/PRX) 1.2B
- **Text encoder**: T5-Gemma-2B (frozen)
- **VAE**: Identity (no compression)
- **Optimizer**: Muon
- **Algorithms**: TREAD + REPA-v3 + LPIPS + Perceptual DINO + EMA
- **EMA smoothing**: 0.999 (updated every 10 batches)
- **Training data sources**: `lehduong/flux_generated`, `LucasFang/FLUX-Reason-6M`, `brivangl/midjourney-v6-llava`
- **Safety annotations**: Training data annotated with [LlavaGuard-7B](https://huggingface.co/AIML-TUDA/LlavaGuard-v1.2-7B-OV) to classify images as safe/unsafe

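The EMA schedule listed above (decay 0.999, applied every 10 batches) can be sketched as follows. This is a minimal illustration only; the parameter name and scalar values are placeholders, not the actual PRX weights:

```python
def ema_update(ema_params, model_params, decay=0.999):
    """One EMA step per parameter: ema <- decay * ema + (1 - decay) * model."""
    for name in ema_params:
        ema_params[name] = decay * ema_params[name] + (1 - decay) * model_params[name]

# Scalar placeholders standing in for the real denoiser tensors
model = {"blocks.0.weight": 1.0}
ema = {"blocks.0.weight": 0.0}

for step in range(1, 1001):
    if step % 10 == 0:   # EMA applied every 10 batches, per the schedule above
        ema_update(ema, model)

# After 100 updates toward a constant target of 1.0, ema = 1 - 0.999**100
print(round(ema["blocks.0.weight"], 4))
```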
## Files

- `denoiser.pt` — Consolidated single-file checkpoint (EMA weights, ready for inference)
- `distributed/` — Original FSDP distributed checkpoint shards
- `config.yaml` — Full Hydra training configuration

## Usage

```python
import torch

# Load the consolidated checkpoint (EMA weights)
state_dict = torch.load("denoiser.pt", map_location="cpu")
# Keys are in the format: denoiser.*
```
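Because the keys carry a `denoiser.` prefix, you may need to strip it before loading the weights into a bare denoiser module. A minimal sketch, using placeholder key names rather than the actual PRX parameter names:

```python
# Placeholder keys; the real checkpoint holds tensors under "denoiser.*"
state_dict = {"denoiser.blocks.0.weight": ..., "denoiser.final.bias": ...}

# Strip the "denoiser." prefix so keys match a bare module's state_dict()
stripped = {k.removeprefix("denoiser."): v for k, v in state_dict.items()}
print(sorted(stripped))  # ['blocks.0.weight', 'final.bias']
```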
For the full generation pipeline, see the [diffusion_safety](https://github.com/felifri/diffusion_safety) repository.

## Citation

If you use these models, please cite the associated paper and the PRX architecture.

## License

Apache 2.0