Vittorio Pippi
committed on
Commit
·
dd90e2c
1
Parent(s):
9b27178
Fix the YAML metadata
README.md
CHANGED
@@ -1,6 +1,3 @@
-# Emuru Convolutional VAE
-
-```yaml
 ---
 language:
 - "en"
@@ -18,9 +15,8 @@ metrics:
 - CER
 library_name: diffusers
 ---
-```
 
-##
+## Emuru Convolutional VAE
 
 This repository hosts the **Emuru Convolutional VAE**, described in our paper. The model features a convolutional encoder and decoder, each with four layers. The output channels for these layers are 32, 64, 128, and 256, respectively. The encoder downsamples an input RGB image \( I \in \mathbb{R}^{3 \times W \times H} \) to a latent representation with a single channel and spatial dimensions \( h \times w \) (where \( h = H/8 \) and \( w = W/8 \)). This design compresses the style information in the image, allowing a lightweight Transformer Decoder to efficiently process the latent features.
 
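
The shape arithmetic in the README paragraph above can be sketched in a few lines of Python. This is only an illustration of the stated 8x downsampling to a single-channel latent, not the model's actual code: the function name `latent_shape`, its `factor` parameter, and the requirement that the height and width be divisible by 8 are assumptions made for this example.

```python
def latent_shape(height, width, factor=8):
    """Return (channels, h, w) of the latent for an RGB input of size height x width.

    Assumption for this sketch: both sides must be divisible by `factor`,
    so that h = H/8 and w = W/8 are exact integers.
    """
    if height % factor != 0 or width % factor != 0:
        raise ValueError("height and width must be multiples of the downsampling factor")
    # The README states the latent has a single channel.
    return (1, height // factor, width // factor)


def compression_ratio(height, width, factor=8):
    """Ratio of input values (3 * H * W) to latent values (1 * h * w)."""
    channels, h, w = latent_shape(height, width, factor)
    return (3 * height * width) / (channels * h * w)


if __name__ == "__main__":
    # Hypothetical input size, chosen only for illustration.
    print(latent_shape(64, 768))
    print(compression_ratio(64, 768))
```

Under these assumptions the ratio is independent of the image size: three input channels times an 8 x 8 spatial reduction gives 192 input values per latent value.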