---
license: cc-by-nc-4.0
tags:
- vae
- image-generation
- diffusion
- complexity-diffusion
library_name: pytorch
pipeline_tag: image-to-image
---
# Complexity-Diffusion VAE
A variational autoencoder (VAE) for the Complexity-Diffusion image generation pipeline.
## Architecture
**89M parameters** | 256x256 images | 4-channel latent space
### Encoder
$$z = \mathcal{E}(x) \in \mathbb{R}^{32 \times 32 \times 4}$$
Compresses 256x256x3 images to 32x32x4 latents (8x spatial compression).
### Decoder
$$\hat{x} = \mathcal{D}(z) \in \mathbb{R}^{256 \times 256 \times 3}$$
### Loss Function
$$\mathcal{L} = \mathcal{L}_{\text{recon}} + \beta \cdot D_{KL}(q(z|x) \| p(z)) + \lambda \cdot \mathcal{L}_{\text{perceptual}}$$
Where:
- $\mathcal{L}_{\text{recon}} = \|x - \hat{x}\|_1$ (L1 reconstruction)
- $D_{KL}$ regularizes latent to $\mathcal{N}(0, I)$
- $\mathcal{L}_{\text{perceptual}}$ uses VGG features
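The objective above can be sketched in PyTorch as follows. This is a minimal illustration, not the training code: the weights `beta` and `lam`, and the `vgg_feats` feature extractor passed in for the perceptual term, are placeholders whose actual values are not specified in this card.

```python
import torch
import torch.nn.functional as F

def vae_loss(x, x_hat, mu, logvar, vgg_feats=None, beta=1e-6, lam=0.1):
    # L1 reconstruction term: ||x - x_hat||_1
    recon = F.l1_loss(x_hat, x)
    # Closed-form KL divergence of a diagonal Gaussian q(z|x) to N(0, I)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    loss = recon + beta * kl
    if vgg_feats is not None:
        # Perceptual term: distance between VGG feature maps of x and x_hat
        loss = loss + lam * F.l1_loss(vgg_feats(x_hat), vgg_feats(x))
    return loss
```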
## Config
| Parameter | Value |
|-----------|-------|
| Image size | 256x256 |
| Latent dim | 4 |
| Base channels | 128 |
| Channel mult | [1, 2, 4, 4] |
| Res blocks | 2 |
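The config is consistent with the 32x32x4 latent, assuming the common convention of one stride-2 downsample between consecutive channel-mult stages (an assumption, since the exact block layout is not documented here):

```python
base, mults = 128, [1, 2, 4, 4]
channels = [base * m for m in mults]   # per-stage widths: [128, 256, 512, 512]
downsamples = len(mults) - 1           # 3 stride-2 stages between 4 levels
spatial = 256 // 2 ** downsamples      # 256 / 8 = 32, matching the latent grid
```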
## Usage
```python
import torch
from safetensors.torch import load_file
from complexity_diffusion.vae import ComplexityVAE

# Load weights
vae = ComplexityVAE(image_size=256, base_channels=128, latent_dim=4)
vae.load_state_dict(load_file("model.safetensors"))
vae.eval()

with torch.no_grad():
    # Encode: [B, 3, 256, 256] -> [B, 4, 32, 32]
    latents = vae.encode(images)
    # Decode: [B, 4, 32, 32] -> [B, 3, 256, 256]
    reconstructed = vae.decode(latents)
```
## Training
Trained on WikiArt (81K images) for 15K steps with:
- Batch size: 16
- Learning rate: 1e-4
- Mixed precision: bf16
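A single optimization step under these settings can be sketched as below. `TinyModel` is a stand-in for the real VAE, the optimizer choice (AdamW) is an assumption, and only the forward pass runs under bf16 autocast:

```python
import torch
import torch.nn as nn

# Stand-in model: NOT the real ComplexityVAE, just something to step.
model = nn.Sequential(
    nn.Conv2d(3, 4, 3, padding=1),   # toy "encoder"
    nn.Conv2d(4, 3, 3, padding=1),   # toy "decoder"
)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)  # lr from above

x = torch.randn(16, 3, 64, 64)  # batch size 16 (small spatial dims for the demo)
with torch.autocast("cpu", dtype=torch.bfloat16):  # bf16 mixed precision
    loss = (model(x) - x).abs().mean()  # L1 reconstruction
loss.backward()
opt.step()
opt.zero_grad()
```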
### Training Curves

## Part of Complexity Deep Ecosystem
This VAE is designed to work with the Complexity-Diffusion pipeline, leveraging:
- **INL Dynamics** for stable latent space training
- **Token-Routed architecture** for efficient processing
## Links
- [Complexity Deep](https://huggingface.co/Pacific-Prime)
- [PyPI Package](https://pypi.org/project/complexity-deep/)
- [GitHub](https://github.com/Complexity-ML/complexity-framework)
- [PyPI](https://pypi.org/project/complexity-framework/)
## License
CC BY-NC 4.0 - Attribution-NonCommercial
Commercial use requires explicit permission from the author.