daiqi committed on
Commit d14a57d · verified · 1 Parent(s): 0c9b1c3

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
  # Reducio-VAE Model Card
 
  <!-- Provide a quick summary of what the model is/does. -->
- This model is a 3D VAE that encodes video into a compact latent space conditioned on a content frame. It compresses a video by a factor of \\(\frac{T}{4}\times\frac{H}{32}\times\frac{W}{32}\\), enabling $4096\times$ downsampling.
+ This model is a 3D VAE that encodes video into a compact latent space conditioned on a content frame. It compresses a video by a factor of \\(\frac{T}{4}\times\frac{H}{32}\times\frac{W}{32}\\), enabling \\(4096\times\\) downsampling.
  It is part of the [Reducio-DiT](https://arxiv.org/abs/xxxx), which is a video generation method. Codebase available [here](https://github.com/microsoft/Reducio-VAE).
 
 
@@ -51,15 +51,15 @@ import torch
 
  Metrics on 1K Pexels validation set and UCF-101:
 
- |Method|Downsample Factor|$\|z\|$|PSNR |SSIM |LPIPS |rFVD (Pexels)|rFVD (UCF-101)|
+ |Method|Downsample Factor|\\(\|z\|\\)|PSNR |SSIM |LPIPS |rFVD (Pexels)|rFVD (UCF-101)|
  |---------|---------------------|------------------|------------|--------------------|--------------|----------------|------------|
- |SD2.1-VAE|$1\times8\times8$|4|29.23|0.82|0.09|25.96|21.00|
- |SDXL-VAE|$1\times8\times8$|16|30.54|0.85|0.08|19.87|23.68|
- |OmniTokenizer|$4\times8\times8$|8|27.11|0.89|0.07|23.88|30.52|
- |OpenSora-1.2|$4\times8\times8$|16|30.72|0.85|0.11|60.88|67.52|
- |Cosmos Tokenizer|$8\times8\times8$|16|30.84|0.74|0.12|29.44|22.06|
- |Cosmos Tokenizer|$8\times16\times16$|16|28.14|0.65|0.18|77.87|119.37|
- |Reducio-VAE|$4\times32\times32$|16|35.88|0.94|0.05|17.88|65.17|
+ |SD2.1-VAE|\\(1\times8\times8\\)|4|29.23|0.82|0.09|25.96|21.00|
+ |SDXL-VAE|\\(1\times8\times8\\)|16|30.54|0.85|0.08|19.87|23.68|
+ |OmniTokenizer|\\(4\times8\times8\\)|8|27.11|0.89|0.07|23.88|30.52|
+ |OpenSora-1.2|\\(4\times8\times8\\)|16|30.72|0.85|0.11|60.88|67.52|
+ |Cosmos Tokenizer|\\(8\times8\times8\\)|16|30.84|0.74|0.12|29.44|22.06|
+ |Cosmos Tokenizer|\\(8\times16\times16\\)|16|28.14|0.65|0.18|77.87|119.37|
+ |Reducio-VAE|\\(4\times32\times32\\)|16|35.88|0.94|0.05|17.88|65.17|
 
 
  ## Citation