microsoft/Reducio-VAE

Video-Generation


Add pipeline tag

#1

by nielsr HF Staff - opened Aug 13, 2025

base: refs/heads/main ← from: refs/pr/1


README.md +5 -4


@@ -3,12 +3,13 @@ license: mit
 tags:
 - VAE
 - Video-Generation
 ---
 # Reducio-VAE Model Card
 <!-- Provide a quick summary of what the model is/does. -->
-This model is a 3D VAE that encodes video into a compact latent space conditioned on a content frame. It compresses a video by a factor of \\(\frac{T}{4}\times\frac{H}{32}\times\frac{W}{32}\\), enabling 4096x downsampling.
 It is part of the [Reducio-DiT](https://arxiv.org/abs/2411.13552), which is a video generation method. Codebase available [here](https://github.com/microsoft/Reducio-VAE).
@@ -18,8 +19,8 @@ It is part of the [Reducio-DiT](https://arxiv.org/abs/2411.13552), which is a vi
 <!-- Provide the basic links for the model. -->
-- **Repository:** [GitHub Repository](https://github.com/microsoft/Reducio-VAE)
-- **Paper:** [arXiv](https://arxiv.org/abs/2411.13552)
 ## Uses
@@ -43,7 +44,7 @@ The model is typically used for supporting training a video diffusion model. Aft
 Metrics on 1K Pexels validation set and UCF-101:
-|Method|Downsample Factor|\|z\||PSNR |SSIM |LPIPS |rFVD (Pexels)|rFVD (UCF-101)|
 |---------|---------------------|------------------|------------|--------------------|--------------|----------------|------------|
 |SD2.1-VAE|1\*8\*8|4|29.23|0.82|0.09|25.96|21.00|
 |SDXL-VAE|1\*8\*8|16|30.54|0.85|0.08|19.87|23.68|

 tags:
 - VAE
 - Video-Generation
+pipeline_tag: image-to-video
 ---
 # Reducio-VAE Model Card
 <!-- Provide a quick summary of what the model is/does. -->
+This model is a 3D VAE that encodes video into a compact latent space conditioned on a content frame. It compresses a video by a factor of \\(\\frac{T}{4}\times\frac{H}{32}\times\frac{W}{32}\\), enabling 4096x downsampling.
 It is part of the [Reducio-DiT](https://arxiv.org/abs/2411.13552), which is a video generation method. Codebase available [here](https://github.com/microsoft/Reducio-VAE).
 <!-- Provide the basic links for the model. -->
+-   **Repository:** [GitHub Repository](https://github.com/microsoft/Reducio-VAE)
+-   **Paper:** [arXiv](https://arxiv.org/abs/2411.13552)
 ## Uses
 Metrics on 1K Pexels validation set and UCF-101:
+|Method|Downsample Factor|\\|z\\||PSNR |SSIM |LPIPS |rFVD (Pexels)|rFVD (UCF-101)|
 |---------|---------------------|------------------|------------|--------------------|--------------|----------------|------------|
 |SD2.1-VAE|1\*8\*8|4|29.23|0.82|0.09|25.96|21.00|
 |SDXL-VAE|1\*8\*8|16|30.54|0.85|0.08|19.87|23.68|
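
The 4096x figure in the model card summary follows from the per-axis factors (4 in time, 32 in height, 32 in width). The sketch below checks that arithmetic. It is illustrative only: `latent_shape` and `downsample_ratio` are hypothetical helper names, not part of the Reducio-VAE codebase, and the 16-frame 256x256 clip is an assumed example input. It counts spatiotemporal positions only and ignores the latent channel count.

```python
def latent_shape(frames, height, width, t_factor=4, s_factor=32):
    """Return (T', H', W') after dividing each axis by its downsample factor.

    Hypothetical helper for illustration; the factors 4 and 32 are the ones
    quoted in the model card summary.
    """
    if frames % t_factor or height % s_factor or width % s_factor:
        raise ValueError("input dimensions must be divisible by the downsample factors")
    return frames // t_factor, height // s_factor, width // s_factor


def downsample_ratio(t_factor=4, s_factor=32):
    """Number of input positions that map to one latent position."""
    return t_factor * s_factor * s_factor


# Assumed example input: 16 frames at 256x256.
shape = latent_shape(16, 256, 256)
ratio = downsample_ratio()
print(shape, ratio)
```

For the assumed clip, the 16 x 256 x 256 grid of positions shrinks to 4 x 8 x 8, which is a 4096-fold reduction in position count.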