Initial release: LiquidFlow architecture + training code + notebook

---
license: mit
tags:
- image-generation
- flow-matching
- liquid-neural-networks
- mamba
- state-space-models
- physics-informed
- lightweight
- mobile-friendly
---

# LiquidFlow: Liquid-SSM Flow Matching Image Generator

A **novel lightweight architecture** for image generation that combines:

| Component | Source | Role |
|-----------|--------|------|
| **Liquid Time-Constant Networks** | [Hasani et al. 2020](https://arxiv.org/abs/2006.04439) | Adaptive ODE dynamics via the closed-form CfC solution; bounded by construction |
| **Selective State Space Models** | [Gu & Dao 2023 (Mamba)](https://arxiv.org/abs/2312.00752) | Linear-time long-range context, parallelizable scanning |
| **Zigzag Scanning** | [ZigMa 2024](https://arxiv.org/abs/2403.13802) | 2D spatial awareness through alternating scan patterns |
| **Physics-Informed Loss** | [Wang et al. 2020](https://arxiv.org/abs/2001.04536), [PIDM 2024](https://arxiv.org/abs/2403.14404) | Smoothness + TV regularization for training stability |
| **Rectified Flow Matching** | [Lipman et al. 2022](https://arxiv.org/abs/2210.02747) | ODE-based generation; no noise schedule tuning needed |

## Key Properties

- **Trainable on the Google Colab free tier** (T4, 16 GB) and on Kaggle
- **Mobile-deployable**: the `tiny` model is only ~6M params (~24 MB)
- **No custom CUDA kernels**: pure PyTorch, runs anywhere
- **No training collapse or explosion**: sigmoid gating in the Liquid CfC cell guarantees bounded dynamics
- **No noise schedule tuning**: flow matching uses simple linear interpolation

## Architecture

```
Noise x_0 ~ N(0,I) ──▶ LiquidFlow v_θ(x_t, t) ──▶ Image x_1
                             │
                      ┌──────┴──────┐
                      │ Patchify    │  (image → non-overlapping patches)
                      │ + PosEmb    │  (2D learnable positions)
                      │ + DepthConv │  (local structure preservation)
                      └──────┬──────┘
                             │
               ┌─────────────┼─────────────┐
               │    L × LiquidSSM Block    │
               │  ┌─────────────────────┐  │
               │  │ AdaLN (t-cond)      │  │ ← DiT-style conditioning
               │  │ Zigzag Scan         │  │ ← rotates scan pattern per layer
               │  │ SelectiveSSM        │  │ ← Mamba-style, input-dependent A,B,C,Δ
               │  │ + LiquidCfC         │  │ ← CfC gating: σ(-f_τ)⊙h + (1-σ(-f_τ))⊙f_x
               │  │ + FFN               │  │ ← GELU feed-forward
               │  │ + Skip Connect      │  │ ← U-Net style long skips
               │  └─────────────────────┘  │
               └─────────────┼─────────────┘
                             │
                      ┌──────┴──────┐
                      │ DepthConv   │  (local refinement)
                      │ Unpatchify  │  (patches → image)
                      └──────┬──────┘
                             │
                 velocity v_θ (same shape as input)
```
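
The project's dependencies include `einops`, so the patchify/unpatchify stages plausibly reduce to two `rearrange` calls. The sketch below shows the idea for patch size 4; it is an illustration of the token layout, not the repository's `model.py`:

```python
import torch
from einops import rearrange

p = 4                                   # patch size used by tiny/small variants
img = torch.randn(2, 3, 128, 128)       # (B, C, H, W)

# Patchify: each p×p patch becomes one token of dimension C·p·p
tokens = rearrange(img, 'b c (h p1) (w p2) -> b (h w) (c p1 p2)', p1=p, p2=p)
print(tokens.shape)                     # torch.Size([2, 1024, 48])

# Unpatchify: the exact inverse, recovering the image layout
out = rearrange(tokens, 'b (h w) (c p1 p2) -> b c (h p1) (w p2)',
                h=128 // p, w=128 // p, p1=p, p2=p)
assert torch.equal(out, img)            # lossless round trip
```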

### Core Innovation: Liquid CfC Cell

Instead of solving the Liquid ODE numerically (sequential, slow):

```
dx/dt = -[1/τ + f(x,I,t)] * x + f(x,I,t)
```

we use the **Closed-form Continuous-depth (CfC)** solution (parallel, fast, stable):

```python
gate  = sigmoid(-f_tau(x, h))               # time-constant gating
new_h = gate * h + (1 - gate) * f_x(x, h)   # bounded update
```

The **sigmoid gating guarantees** that hidden states stay bounded: each update is a convex combination of the previous state and the candidate, so explosion and collapse are impossible by construction.
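
To make the recurrence concrete, here is a minimal self-contained cell built around that exact update. The layer shapes, the `tanh` on the candidate, and the class name are illustrative assumptions, not the repository's `LiquidCfCCell`:

```python
import torch
import torch.nn as nn

class MinimalCfCCell(nn.Module):
    """Illustrative CfC cell; hypothetical, not the repo's LiquidCfCCell."""
    def __init__(self, dim: int):
        super().__init__()
        self.f_tau = nn.Linear(2 * dim, dim)  # produces per-unit time constants
        self.f_x   = nn.Linear(2 * dim, dim)  # produces the candidate state

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        xh = torch.cat([x, h], dim=-1)
        gate = torch.sigmoid(-self.f_tau(xh))   # in (0, 1) by construction
        cand = torch.tanh(self.f_x(xh))         # bounded candidate state
        return gate * h + (1 - gate) * cand     # convex combination → bounded

# Usage: states stay finite even under absurdly scaled inputs
cell = MinimalCfCCell(dim=64)
h = torch.zeros(8, 64)
for _ in range(100):
    h = cell(torch.randn(8, 64) * 1e3, h)
assert torch.isfinite(h).all()
```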

### Dual-Path Processing

Each LiquidSSM Block has two parallel branches:

1. **SSM Branch**: selective scan (Mamba-style) with zigzag patterns; captures global spatial dependencies
2. **Liquid Branch**: CfC cell; adds continuous-time adaptive dynamics

A learnable mixing coefficient `α` balances them: `output = α·SSM + (1-α)·Liquid` (see the sketch below).
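
A minimal sketch of that mixing, assuming `α` is stored as a raw logit and squashed into (0, 1); the class and attribute names are hypothetical, not the repository's block:

```python
import torch
import torch.nn as nn

class DualPathMix(nn.Module):
    """Hypothetical dual-path mixer: alpha is learned as a logit."""
    def __init__(self, ssm: nn.Module, liquid: nn.Module):
        super().__init__()
        self.ssm, self.liquid = ssm, liquid
        self.alpha_logit = nn.Parameter(torch.zeros(1))  # starts at alpha = 0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        alpha = torch.sigmoid(self.alpha_logit)          # keeps alpha in (0, 1)
        return alpha * self.ssm(x) + (1 - alpha) * self.liquid(x)
```

Storing the logit rather than `α` itself keeps the mixture valid without clipping: near `α ≈ 1` the block degenerates to a pure SSM layer, near `α ≈ 0` to a pure liquid one.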

## Model Variants

| Variant | Params | Image Size | Patch | GPU VRAM (bs=16) | Use Case |
|---------|--------|------------|-------|------------------|----------|
| `tiny`  | 5.9M   | 128×128    | 4     | ~4 GB            | Quick experiments, mobile |
| `small` | 13.7M  | 128×128    | 4     | ~8 GB            | Production 128×128 |
| `base`  | 37.6M  | 256×256    | 8     | ~12 GB           | High quality |
| `512`   | 38.1M  | 512×512    | 16    | ~14 GB           | High resolution |

## Quick Start

### Colab / Kaggle (Recommended)

Open the notebook: **`LiquidFlow_Training.ipynb`**

It has interactive widgets for:

- Dataset selection (CIFAR-10, Flowers-102, CelebA, Fashion-MNIST, AFHQ, or a custom folder)
- Model size and all hyperparameters
- Automatic batch-size adjustment for your GPU

### Command Line

```bash
pip install torch torchvision einops pillow matplotlib tqdm

# Quick test (CIFAR-10, 32×32)
python liquidflow/train.py --model_size tiny --img_size 32 --dataset cifar10 --epochs 50 --batch_size 64

# Production (Flowers-102, 128×128)
python liquidflow/train.py --model_size small --img_size 128 --dataset flowers --epochs 200 --batch_size 16

# Custom images
python liquidflow/train.py --model_size small --img_size 128 --dataset folder --data_dir /path/to/images
```

### Python API

```python
from liquidflow import liquidflow_small, euler_sample, make_grid_image
import torch

model = liquidflow_small(img_size=128)  # 13.7M params
# ... after training ...
model.eval()
images = euler_sample(model, (16, 3, 128, 128), num_steps=50, device='cuda')
grid = make_grid_image(images.clamp(-1, 1) * 0.5 + 0.5, nrow=4)  # map [-1, 1] to [0, 1]
grid.save('generated.png')
```
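
For intuition about what `euler_sample` is doing, here is a minimal Euler integrator for the learned ODE `dx/dt = v_θ(x, t)`. It assumes the model is called as `model(x, t)` and integrates from noise at `t = 0` to data at `t = 1`; a sketch of the idea, not the repository's `sampling.py`:

```python
import torch

@torch.no_grad()
def euler_sample_sketch(model, shape, num_steps=50, device='cpu'):
    """Hypothetical Euler integrator for dx/dt = v_theta(x, t)."""
    x = torch.randn(shape, device=device)          # start from noise at t = 0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((shape[0],), i * dt, device=device)
        x = x + dt * model(x, t)                   # one explicit Euler step
    return x                                       # approximate sample at t = 1
```

The Heun sampler mentioned below differs only in adding a second velocity evaluation per step to correct each Euler update.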

## File Structure

```
├── liquidflow/
│   ├── __init__.py              # Package exports
│   ├── model.py                 # Core architecture (LiquidFlowNet, LiquidCfCCell, SelectiveSSM)
│   ├── losses.py                # Physics-informed flow matching loss + EMA
│   ├── sampling.py              # Euler & Heun ODE samplers
│   └── train.py                 # Full training script with CLI
├── LiquidFlow_Training.ipynb    # Colab/Kaggle notebook
├── smoke_test.py                # Comprehensive CPU test suite (25 tests)
└── README.md
```

## Physics-Informed Loss

```
L = L_flow + λ_smooth · L_smooth + λ_tv · L_tv
```

| Term | Formula | Purpose |
|------|---------|---------|
| `L_flow`   | `‖v_θ(x_t, t) - (x_1 - x_0)‖²` | Learn the straight-line velocity field |
| `L_smooth` | `‖∇²x_pred‖²` (Laplacian)      | Penalize high-frequency noise |
| `L_tv`     | `‖∇x_pred‖₁` (total variation) | Edge-preserving smoothness |

The physics terms are **warmed up** over the first 500 training steps.
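
As a concrete sketch of how these terms combine, the following computes all three with finite differences and a linear warm-up. The function name, default weights, and the `model(x_t, t)` signature are assumptions for illustration, not the exports of `liquidflow/losses.py`:

```python
import torch
import torch.nn.functional as F

def physics_flow_loss(model, x1, step, lam_smooth=1e-4, lam_tv=1e-4, warmup=500):
    """Flow-matching loss plus hypothetical smoothness/TV terms (illustrative)."""
    x0 = torch.randn_like(x1)                       # noise endpoint
    t  = torch.rand(x1.size(0), device=x1.device).view(-1, 1, 1, 1)
    xt = (1 - t) * x0 + t * x1                      # linear interpolation path
    v  = model(xt, t.flatten())
    loss_flow = F.mse_loss(v, x1 - x0)              # straight-line target velocity

    x_pred = xt + (1 - t) * v                       # one-step estimate of x1
    dx = x_pred[..., :, 1:] - x_pred[..., :, :-1]   # horizontal finite difference
    dy = x_pred[..., 1:, :] - x_pred[..., :-1, :]   # vertical finite difference
    loss_tv = dx.abs().mean() + dy.abs().mean()     # total variation (L1)

    lap = (x_pred[..., 1:-1, 1:-1] * 4              # 5-point Laplacian stencil
           - x_pred[..., :-2, 1:-1] - x_pred[..., 2:, 1:-1]
           - x_pred[..., 1:-1, :-2] - x_pred[..., 1:-1, 2:])
    loss_smooth = lap.pow(2).mean()

    w = min(step / warmup, 1.0)                     # linear warm-up of physics terms
    return loss_flow + w * (lam_smooth * loss_smooth + lam_tv * loss_tv)
```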

## Recommended Experiments

| Goal | Dataset | Model | Image Size | Epochs | Time (T4) |
|------|---------|-------|------------|--------|-----------|
| Sanity check | CIFAR-10    | `tiny`  | 32  | 20  | ~5 min  |
| Baseline     | CIFAR-10    | `tiny`  | 128 | 100 | ~2 hrs  |
| Quality      | Flowers-102 | `small` | 128 | 200 | ~4 hrs  |
| Faces        | CelebA      | `small` | 128 | 50  | ~6 hrs  |
| High-res     | CelebA      | `512`   | 512 | 100 | ~12 hrs |

## Mobile Export

The notebook includes TorchScript and ONNX export cells. The `tiny` model produces a ~24 MB file for on-device inference. A sketch of the export step is shown below.
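
For reference, a minimal export might look like this. It assumes a `liquidflow_tiny` factory mirroring `liquidflow_small` above and a model that traces cleanly with `(x, t)` inputs; both are assumptions about the repository's code, not something quoted from the notebook:

```python
import torch
from liquidflow import liquidflow_tiny  # assumed factory, mirroring liquidflow_small

model = liquidflow_tiny(img_size=128).eval()
x = torch.randn(1, 3, 128, 128)          # dummy image-shaped input
t = torch.full((1,), 0.5)                # dummy timestep

# TorchScript via tracing (assumes no data-dependent control flow)
scripted = torch.jit.trace(model, (x, t))
scripted.save('liquidflow_tiny.pt')

# ONNX export with the same dummy inputs
torch.onnx.export(model, (x, t), 'liquidflow_tiny.onnx',
                  input_names=['x_t', 't'], output_names=['velocity'])
```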

## Verified (25/25 smoke tests pass)

- All 4 model variants: forward pass ✓
- Backward pass: all parameters receive gradients ✓
- Gradient health: no NaN, no Inf ✓
- Loss convergence: finite across optimizer steps ✓
- Individual components: LiquidCfCCell, SelectiveSSM, LiquidSSMBlock ✓
- Scan patterns: 4 patterns, all invertible ✓
- Sampling: Euler + Heun produce finite images ✓
- EMA: apply/restore cycle ✓
- Checkpoint: save/load round-trip ✓
- Physics loss: all terms finite and positive ✓

## References

1. Hasani et al., "Liquid Time-Constant Networks", AAAI 2021 ([2006.04439](https://arxiv.org/abs/2006.04439))
2. Hasani et al., "Closed-form Continuous-depth Models", Nature Machine Intelligence 2022
3. Gu & Dao, "Mamba: Linear-Time Sequence Modeling", 2023 ([2312.00752](https://arxiv.org/abs/2312.00752))
4. Teng et al., "DiM: Diffusion Mamba", 2024 ([2405.14224](https://arxiv.org/abs/2405.14224))
5. Hu et al., "ZigMa: Zigzag Mamba Diffusion", 2024 ([2403.13802](https://arxiv.org/abs/2403.13802))
6. Lipman et al., "Flow Matching for Generative Modeling", ICLR 2023 ([2210.02747](https://arxiv.org/abs/2210.02747))
7. Raissi et al., "Physics-Informed Neural Networks", JCP 2019 ([1711.10561](https://arxiv.org/abs/1711.10561))
8. Wang et al., "Gradient Pathologies in PINNs", 2020 ([2001.04536](https://arxiv.org/abs/2001.04536))
9. Bastek & Kochmann, "Physics-Informed Diffusion Models", 2024 ([2403.14404](https://arxiv.org/abs/2403.14404))
10. Zhu et al., "Vision Mamba", 2024 ([2401.09417](https://arxiv.org/abs/2401.09417))

## License

MIT