lavinal712 committed
Commit 799a0ee · verified · 1 Parent(s): e6fc2f4

Update README.md

Files changed (1)
  1. README.md +18 -2
README.md CHANGED
@@ -23,7 +23,23 @@ This model was trained for 7 epochs on ImageNet, with training parameters follow
 
 $$\mathcal{L}_{\mathrm{VAE}} = \mathcal{L}_1 + \mathcal{L}_{\mathrm{LPIPS}} + 0.5\mathcal{L}_{\mathrm{GAN}} + 0.2\mathcal{L}_{\mathrm{ID}} + 0.000001\mathcal{L}_{\mathrm{KL}}$$
 
-## Evaluation
+## Evaluation
+
+ImageNet 2012 (256x256, val, 50000 images)
+
+| Model           | rFID  | PSNR   | SSIM  | LPIPS |
+|-----------------|-------|--------|-------|-------|
+| Transfusion-VAE | 0.408 | 28.723 | 0.845 | 0.081 |
+| SD-VAE          | 0.692 | 26.910 | 0.772 | 0.130 |
+
+COCO 2017 (256x256, val, 5000 images)
+
+| Model           | rFID  | PSNR   | SSIM  | LPIPS |
+|-----------------|-------|--------|-------|-------|
+| Transfusion-VAE | 2.749 | 28.556 | 0.855 | 0.078 |
+| SD-VAE          | 4.246 | 26.622 | 0.784 | 0.127 |
+
+## Evaluation (legacy)
 
 ImageNet 2012 (256x256, val, 50000 images)
 
@@ -34,6 +50,6 @@ ImageNet 2012 (256x256, val, 50000 images)
 
 Paper: [Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model](https://arxiv.org/abs/2408.11039)
 
-Dataset: [ImageNet](https://image-net.org/)
+Dataset: [ImageNet](https://image-net.org/), [COCO](https://cocodataset.org/), [FFHQ](https://github.com/NVlabs/ffhq-dataset)
 
 Base Code: [lavinal712/AutoencoderKL](https://github.com/lavinal712/AutoencoderKL)
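
The loss formula in the first hunk's context combines five terms with fixed weights. As a rough illustration only, the weighting could be sketched as below. The dictionary keys and the `combine_vae_loss` helper are hypothetical names chosen for this sketch and are not taken from the lavinal712/AutoencoderKL code; only the weights come from the formula.

```python
# Weights copied from the README's loss formula. The term names are
# hypothetical labels for this sketch, not identifiers from the repository.
LOSS_WEIGHTS = {
    "l1": 1.0,
    "lpips": 1.0,
    "gan": 0.5,
    "id": 0.2,
    "kl": 0.000001,
}


def combine_vae_loss(terms):
    """Return the weighted sum of the per-term loss values in `terms`."""
    missing = set(LOSS_WEIGHTS) - set(terms)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    return sum(LOSS_WEIGHTS[name] * terms[name] for name in LOSS_WEIGHTS)


if __name__ == "__main__":
    # Made-up per-term values, used only to exercise the function.
    example = {"l1": 0.1, "lpips": 0.2, "gan": 1.0, "id": 0.5, "kl": 1000.0}
    print(combine_vae_loss(example))
```

With these weights, a KL term contributes one millionth of its raw value to the total, so it acts as a very light regularizer relative to the reconstruction terms.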