Update README.md
README.md
CHANGED
@@ -8,7 +8,7 @@ tags:
 # Reducio-VAE Model Card
 
 <!-- Provide a quick summary of what the model is/does. -->
-This model is a 3D VAE that encodes video into a compact latent space conditioned on a content frame. It compresses a video by a factor of \\(\frac{T}{4}\times\frac{H}{32}\times\frac{W}{32}\\), enabling
+This model is a 3D VAE that encodes video into a compact latent space conditioned on a content frame. It compresses a video by a factor of \\(\frac{T}{4}\times\frac{H}{32}\times\frac{W}{32}\\), enabling \\(4096\times\\) downsampling.
 It is part of the [Reducio-DiT](https://arxiv.org/abs/xxxx), which is a video generation method. Codebase available [here](https://github.com/microsoft/Reducio-VAE).
 
 
@@ -51,15 +51,15 @@ import torch
 
 Metrics on 1K Pexels validation set and UCF-101:
 
-|Method|Downsample Factor|
+|Method|Downsample Factor|\\(\|z\|\\)|PSNR |SSIM |LPIPS |rFVD (Pexels)|rFVD (UCF-101)|
 |---------|---------------------|------------------|------------|--------------------|--------------|----------------|------------|
-|SD2.1-VAE|
-|SDXL-VAE|
-|OmniTokenizer|
-|OpenSora-1.2|
-|Cosmos Tokenizer|
-|Cosmos Tokenizer|
-|Reducio-VAE|
+|SD2.1-VAE|\\(1\times8\times8\\)|4|29.23|0.82|0.09|25.96|21.00|
+|SDXL-VAE|\\(1\times8\times8\\)|16|30.54|0.85|0.08|19.87|23.68|
+|OmniTokenizer|\\(4\times8\times8\\)|8|27.11|0.89|0.07|23.88|30.52|
+|OpenSora-1.2|\\(4\times8\times8\\)|16|30.72|0.85|0.11|60.88|67.52|
+|Cosmos Tokenizer|\\(8\times8\times8\\)|16|30.84|0.74|0.12|29.44|22.06|
+|Cosmos Tokenizer|\\(8\times16\times16\\)|16|28.14|0.65|0.18|77.87|119.37|
+|Reducio-VAE|\\(4\times32\times32\\)|16|35.88|0.94|0.05|17.88|65.17|
 
 
 ## Citation
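The \\(4096\times\\) figure in the updated summary line follows from the \\(4\times32\times32\\) downsample factor listed for Reducio-VAE in the table. The sketch below only checks that arithmetic. It assumes the factor means a latent grid of T/4 × H/32 × W/32 positions and that the input sizes divide evenly. The function names `latent_grid` and `downsample_ratio` are hypothetical and are not part of the Reducio codebase, and the real model may pad inputs or handle the content frame differently.

```python
def latent_grid(frames, height, width, ft=4, fh=32, fw=32):
    """Return the (T', H', W') latent grid for a video clip.

    Assumption: each dimension is exactly divisible by its factor;
    how the actual model pads or crops is not stated in the README.
    """
    for name, size, factor in (("frames", frames, ft),
                               ("height", height, fh),
                               ("width", width, fw)):
        if size % factor != 0:
            raise ValueError(f"{name}={size} is not divisible by {factor}")
    return frames // ft, height // fh, width // fw


def downsample_ratio(ft=4, fh=32, fw=32):
    """Number of input positions represented by one latent position."""
    return ft * fh * fw


# Illustrative clip size (hypothetical, not taken from the README).
t, h, w = 16, 256, 256
grid = latent_grid(t, h, w)
ratio = downsample_ratio()

print(grid)   # latent grid (T', H', W')
print(ratio)  # positions collapsed into one latent position
```

For a 16-frame 256×256 clip this yields a 4×8×8 latent grid, so 1,048,576 input positions map to 256 latent positions, a ratio of 4096. The count ignores channel width (the \\(\|z\|\\) column), so it describes spatiotemporal downsampling only, not the overall bit-level compression rate.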