# 🌊 LiquidDiffusion

**An attention-free image generation model built on Liquid Neural Networks**

## What is this?

LiquidDiffusion is an image generation model that replaces attention with **Parallel CfC (Closed-form Continuous-depth) blocks** from Liquid Neural Network research. To our knowledge, no published paper combines LNNs with image generation — this project fills that gap.

### Key Properties

- ✅ **Zero attention layers** — fully convolutional + liquid time-gating
- ✅ **Fully parallelizable** — no ODE solvers, no sequential scanning, no recurrence
- ✅ **Latent-space training** — uses a pretrained SD-VAE (stabilityai/sd-vae-ft-mse, 83.7M params, frozen)
- ✅ **Fits in 16 GB VRAM** — the tiny config runs 256px at batch=8 on a T4 GPU
- ✅ **Simple training** — Rectified Flow (MSE velocity prediction, no noise schedule)
- ✅ **6 verified datasets** — all tested and working, with streaming support

## Quick Start (Colab)

1. Open `LiquidDiffusion_Training.ipynb` in Colab
2. Select a GPU runtime (T4)
3. Pick a dataset from the dropdown (default: huggan/AFHQv2 — animal faces)
4. Run all cells → training starts, with samples generated every 500 steps

## Architecture

```
Pixel Image (3×256×256)
  → [Frozen SD-VAE Encode]  → Latent (4×32×32)
  → [LiquidDiffusion U-Net] → Velocity prediction (4×32×32)
  → [Frozen SD-VAE Decode]  → Generated Image (3×256×256)
```

Each **LiquidDiffusionBlock** contains:

1. **AdaLN** — timestep conditioning via learned scale/shift
2. **ParallelCfCBlock** — the core liquid neural network layer (CfC Eq. 10)
3. **MultiScaleSpatialMix** — 3×3 + 5×5 + 7×7 depthwise convs + global pooling (replaces attention)
4. **FeedForward** — channel mixing via 1×1 conv

### The ParallelCfC Block

```python
# CfC Eq. 10 adapted for images:
gate   = σ(time_a(t_emb) · f(features) - time_b(t_emb))  # liquid time-gating
out    = gate · g(features) + (1 - gate) · h(features)   # CfC interpolation
α      = exp(-λ · |t_emb|)                               # liquid relaxation
output = α · input + (1 - α) · out                       # time-aware residual
```

## Verified Datasets

All tested and working, with streaming support:

| Dataset | Images | Description | Native Resolution |
|---------|--------|-------------|-------------------|
| `huggan/AFHQv2` | 16K | Animal faces (cats, dogs, wildlife) | 512×512 |
| `nielsr/CelebA-faces` | 202K | Celebrity faces | 178×218 |
| `huggan/flowers-102-categories` | 8K | Flower photographs | Variable |
| `reach-vb/pokemon-blip-captions` | 833 | Pokemon illustrations | 1280×1280 |
| `huggan/anime-faces` | 63K | Anime faces | 64×64 |
| `Norod78/cartoon-blip-captions` | ~3K | Cartoon characters | 512×512 |

## VAE

Uses **stabilityai/sd-vae-ft-mse** (83.7M params, frozen during training):

- 4 latent channels, 8× spatial downscale
- PSNR 27.3 on LAION-Aesthetics (excellent reconstruction)
- ~160 MB VRAM in fp16
- Scaling factor: 0.18215

## Model Configs

| Config | Params | 256px VRAM (w/ VAE) | 512px VRAM |
|--------|--------|---------------------|------------|
| tiny | ~23M | ~6 GB | ~12 GB |
| small | ~69M | ~10 GB | ~20 GB |
| base | ~154M | ~16 GB | ~30 GB |

## Training

**Objective**: Rectified Flow — a plain MSE loss on the velocity:

```python
x_t = (1 - t) · x0 + t · noise        # linear interpolation
v_target = noise - x0                 # constant velocity
loss = MSE(model(x_t, t), v_target)   # that's it!
```
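To make the objective and the Euler sampling concrete, here is a short runnable NumPy sketch. It is illustrative only: `rectified_flow_loss`, `euler_sample`, and `toy_model` are hypothetical names, not this repo's actual API.

```python
import numpy as np

def rectified_flow_loss(model, x0, rng):
    """Rectified-flow velocity MSE for a batch of latents x0 with shape (B, C, H, W)."""
    noise = rng.standard_normal(x0.shape)         # endpoint x1 ~ N(0, I)
    t = rng.uniform(size=(x0.shape[0], 1, 1, 1))  # one timestep per sample
    x_t = (1 - t) * x0 + t * noise                # linear interpolation
    v_target = noise - x0                         # constant velocity
    return float(np.mean((model(x_t, t) - v_target) ** 2))

def euler_sample(model, z, steps):
    """Euler integration from t=1 (noise) down to t=0 (data): z <- z - dt * v(z, t)."""
    dt = 1.0 / steps
    t = 1.0
    for _ in range(steps):
        t_batch = np.full((z.shape[0], 1, 1, 1), t)
        z = z - dt * model(z, t_batch)
        t -= dt
    return z

# Hypothetical stand-in model that always predicts zero velocity.
toy_model = lambda x, t: np.zeros_like(x)
rng = np.random.default_rng(0)

x0 = np.zeros((2, 4, 32, 32))                     # dummy SD-VAE-sized latents
loss = rectified_flow_loss(toy_model, x0, rng)
# With x0 = 0 and zero predictions, loss is the mean of noise**2, i.e. close to 1.

z1 = rng.standard_normal((1, 4, 32, 32))
x_hat = euler_sample(toy_model, z1, steps=25)     # zero velocity leaves z1 unchanged
```

In training, `model` would be the LiquidDiffusion U-Net operating on VAE latents; the sampler is the 25-50-step Euler ODE integration described below.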
**Sampling**: Euler ODE integration, 25-50 steps

## References

| Paper | Contribution |
|-------|--------------|
| [CfC Networks (Nature MI 2022)](https://arxiv.org/abs/2106.13898) | CfC Eq. 10, parallelizable closed form |
| [LTC Networks (AAAI 2021)](https://arxiv.org/abs/2006.04439) | Liquid time-constant ODE |
| [LiquidTAD (2024)](https://arxiv.org/abs/2604.18274) | Parallel liquid relaxation |
| [USM (CVPR 2025)](https://arxiv.org/abs/2504.13499) | U-Net + SSM for diffusion |
| [DiffuSSM (2023)](https://arxiv.org/abs/2311.18257) | SSM replaces attention in diffusion |
| [Rectified Flow (ICLR 2023)](https://arxiv.org/abs/2209.03003) | Simple velocity training |

## Files

```
├── liquid_diffusion/
│   ├── __init__.py
│   ├── model.py                     # Full model architecture
│   └── trainer.py                   # Trainer + dataset utilities
├── LiquidDiffusion_Training.ipynb   # Complete Colab notebook
├── test_model.py
└── README.md
```

## License

MIT