---
library_name: KVAE 3D
tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin
  - vae
license: mit
---

KVAE-3D 1.0: Video tokenizer

The KVAE-3D model compresses video by a factor of 4 in time and 8x8 in space, and produces 16 latent channels.
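To make the compression factors concrete, here is a minimal sketch of the shape arithmetic. The helper names `latent_shape` and `compression_ratio` are hypothetical and not part of the KVAE-3D code. The sketch assumes plain floor division on every axis and a 3-channel RGB input; the model's actual rounding and any special handling of the first frame are not stated here and may differ.

```python
def latent_shape(frames, height, width, t_factor=4, s_factor=8, channels=16):
    """Return (channels, frames', height', width') for a given input clip.

    Assumption: each axis is simply floor-divided by its compression factor.
    """
    return (channels, frames // t_factor, height // s_factor, width // s_factor)


def compression_ratio(in_channels=3, t_factor=4, s_factor=8, channels=16):
    """Ratio of input values to latent values for one 4x8x8 block.

    Assumption: the input is RGB, i.e. 3 channels per pixel.
    """
    return (in_channels * t_factor * s_factor * s_factor) / channels


# Example: a 16-frame 256x256 clip under the assumptions above.
print(latent_shape(16, 256, 256))
print(compression_ratio())
```

Under these assumptions, each 4x8x8 block of RGB pixels (768 values) maps to 16 latent values.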

Evaluation results

Reconstruction comparison of KVAE-3D and HunyuanVideo:

[Image: kvae3d_comparison]

Evaluation results on the MCL-JCV dataset. All compared models use 4x8x8 compression with 16 latent channels. Higher is better for PSNR and SSIM; lower is better for LPIPS.

| Model | PSNR (dB) | SSIM | LPIPS |
|---|---|---|---|
| Wan-2.1 | 33.75 | 0.90 | 0.089 |
| HunyuanVideo | 33.91 | 0.91 | 0.103 |
| KVAE-3D | 35.63 | 0.92 | 0.088 |
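For readers unfamiliar with the PSNR column, the following is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE), on flat sequences of pixel values. It is not the evaluation script used for the results above: the `psnr` helper is hypothetical, and the 8-bit value range and any per-frame or per-video averaging are assumptions.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumption: pixel values lie in [0, max_value] (8-bit by default).
    """
    if len(reference) != len(reconstruction) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Example: every pixel off by one intensity level.
print(round(psnr([10, 20, 30], [11, 21, 31]), 2))
```

Because PSNR is logarithmic in the mean squared error, a gain of roughly 1.7 dB corresponds to about a one-third reduction in MSE.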