schopra committed on
Commit 0c17e75 · verified · 1 Parent(s): 893b85c

Upload README.md with huggingface_hub

  1. README.md +61 -0
README.md ADDED
@@ -0,0 +1,61 @@
+ ---
+ license: apache-2.0
+ tags:
+ - vae
+ - video
+ - image
+ - autoencoder
+ - 3d-convolution
+ library_name: image-video-vae
+ ---
+
+ # Image-Video-VAE
+
+ 3D Convolutional VAE for encoding and decoding both images and video, trained from scratch by [Linum AI](https://linum.ai). [Read the blog post](https://www.linum.ai/field-notes/vae-reconstruction-vs-generation).
+
+ ## Model Description
+
+ A 346.6M parameter 3D convolutional autoencoder that compresses images and video into a compact latent space.
+
+ | Property | Value |
+ |----------|-------|
+ | Spatial compression | 8x |
+ | Temporal compression | 4x |
+ | Latent channels | 16 |
+ | Parameters | 346.6M (170.1M encoder, 176.5M decoder) |
+ | Pixel normalization | [0, 1] |
+ | Precision | bfloat16 |
+
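To make the compression figures in the table concrete, here is a minimal sketch of the latent shape they imply for a video clip. The helper names are hypothetical and not part of the repository. The sketch assumes RGB input, plain division by the compression factors, and frame counts and resolutions that divide evenly. How the model actually handles single images or non-divisible sizes is not stated here.

```python
from math import prod

# Values taken from the model card table.
SPATIAL_FACTOR = 8
TEMPORAL_FACTOR = 4
LATENT_CHANNELS = 16


def latent_shape(frames: int, height: int, width: int) -> tuple[int, int, int, int]:
    """Hypothetical helper: estimate the latent shape as (C, T, H, W).

    Assumes every dimension divides evenly by its compression factor;
    the real model may pad or treat the first frame differently.
    """
    if height % SPATIAL_FACTOR or width % SPATIAL_FACTOR:
        raise ValueError("height and width must be multiples of 8 in this sketch")
    if frames % TEMPORAL_FACTOR:
        raise ValueError("frames must be a multiple of 4 in this sketch")
    return (
        LATENT_CHANNELS,
        frames // TEMPORAL_FACTOR,
        height // SPATIAL_FACTOR,
        width // SPATIAL_FACTOR,
    )


def compression_ratio(frames: int, height: int, width: int, pixel_channels: int = 3) -> float:
    """Ratio of input pixel values to latent values (RGB input assumed)."""
    pixel_values = pixel_channels * frames * height * width
    latent_values = prod(latent_shape(frames, height, width))
    return pixel_values / latent_values


if __name__ == "__main__":
    print(latent_shape(16, 256, 256))       # (16, 4, 32, 32)
    print(compression_ratio(16, 256, 256))  # 48.0
```

Under these assumptions the ratio does not depend on clip size: 3 input channels times 4 x 8 x 8 positions map to 16 latent channels, which gives 48 input values per latent value.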
+ ## Quick Start
+
+ **Full documentation: [GitHub - Linum-AI/image-video-vae](https://github.com/Linum-AI/image-video-vae)**
+
+ ```bash
+ git clone https://github.com/Linum-AI/image-video-vae.git
+ cd image-video-vae
+ uv sync
+ uv run python encode_decode.py --mode image --input examples/images/original/camel_closeup.jpg
+ ```
+
+ Weights are downloaded automatically on first run (~1.3GB).
+
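As a rough sanity check on the download size, a checkpoint is approximately its parameter count times the bytes stored per parameter. The sketch below is arithmetic only. The helper name is hypothetical, file metadata overhead is ignored, and the dtype the weights are stored in is not stated in this card, so both the 2-byte and 4-byte cases are shown.

```python
# Parameter count taken from the model card table.
NUM_PARAMS = 346.6e6


def checkpoint_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Hypothetical helper: approximate checkpoint size in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # 2 bytes per parameter (e.g. bfloat16 or float16 storage)
    print(round(checkpoint_size_gb(NUM_PARAMS, 2), 2))  # 0.69
    # 4 bytes per parameter (e.g. float32 storage)
    print(round(checkpoint_size_gb(NUM_PARAMS, 4), 2))  # 1.39
```

The 4-byte estimate (about 1.39 GB, or roughly 1.29 GiB) is the one closer to the stated ~1.3GB download, but this is an inference from arithmetic, not a statement about how the file was saved.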
+ ## Files
+
+ ```
+ └── vae.safetensors # VAE model weights (1.3GB)
+ ```
+
+ ## License
+
+ [Apache 2.0](LICENSE)
+
+ ## Citation
+
+ ```bibtex
+ @online{image_video_vae_2026,
+ title = {VAE: Reconstruction vs. Generation},
+ author = {Linum AI},
+ year = {2026},
+ url = {https://www.linum.ai/field-notes/vae-reconstruction-vs-generation}
+ }
+ ```