---
license: mit
tags:
- image-generation
- flow-matching
- liquid-neural-networks
- mamba
- state-space-models
- physics-informed
- lightweight
- mobile-friendly
---
# LiquidFlow: Liquid-SSM Flow Matching Image Generator
A novel lightweight architecture for image generation that combines:
| Component | Source | Role |
|---|---|---|
| Liquid Time-Constant Networks | Hasani et al. 2020 | Adaptive ODE dynamics via closed-form CfC; bounded by construction |
| Selective State Space Models | Gu & Dao 2023 (Mamba) | Linear-time long-range context, parallelizable scanning |
| Zigzag Scanning | ZigMa 2024 | 2D spatial awareness through alternating scan patterns |
| Physics-Informed Loss | Wang et al. 2020, PIDM 2024 | Smoothness + TV regularization for training stability |
| Rectified Flow Matching | Lipman et al. 2022 | ODE-based generation; no noise schedule tuning needed |
## Key Properties
- Trainable on Google Colab free tier (T4, 16 GB) and Kaggle
- Mobile-deployable: the tiny model is only ~6M params (~24 MB)
- No custom CUDA kernels: pure PyTorch, runs anywhere
- No training collapse/explosion: sigmoid gating in the Liquid CfC cell guarantees bounded dynamics
- No noise schedule tuning: flow matching uses simple linear interpolation
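The linear-interpolation point is easy to make concrete. A minimal sketch of how a rectified-flow training pair is built (function name hypothetical, not the repo's API):

```python
import torch

def flow_matching_pair(x1: torch.Tensor):
    """Build one rectified-flow training example from a clean image batch x1."""
    x0 = torch.randn_like(x1)             # noise endpoint
    t = torch.rand(x1.shape[0], 1, 1, 1)  # one timestep per sample in [0, 1)
    xt = (1 - t) * x0 + t * x1            # straight-line interpolation
    v_target = x1 - x0                    # constant target velocity along the line
    return xt, t.flatten(), v_target
```

The model is trained to predict `v_target` from `(xt, t)`; no beta schedule or variance bookkeeping is involved.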
## Architecture
```
Noise x₀ ~ N(0, I) ──▶ LiquidFlow v_θ(x_t, t) ──▶ Image x₁

           │
    ┌──────┴──────┐
    │  Patchify   │  (image → non-overlapping patches)
    │  + PosEmb   │  (2D learnable positions)
    │  + DepthConv│  (local structure preservation)
    └──────┬──────┘
           │
 ┌─────────┴───────────────┐
 │   L × LiquidSSM Block   │
 │  ┌───────────────────┐  │
 │  │ AdaLN (t-cond)    │  │ ← DiT-style conditioning
 │  │ Zigzag Scan       │  │ ← rotates scan pattern per layer
 │  │ SelectiveSSM      │  │ ← Mamba-style, input-dependent A, B, C, Δ
 │  │ + LiquidCfC       │  │ ← CfC gating: σ(−f_τ)⊙h + (1−σ(−f_τ))⊙f_x
 │  │ + FFN             │  │ ← GELU feed-forward
 │  │ + Skip Connect    │  │ ← U-Net style long skips
 │  └───────────────────┘  │
 └─────────┬───────────────┘
           │
    ┌──────┴──────┐
    │  DepthConv  │  (local refinement)
    │  Unpatchify │  (patches → image)
    └──────┬──────┘
           │
  velocity v_θ (same shape as input)
```
### Core Innovation: Liquid CfC Cell
Instead of solving the Liquid ODE numerically (sequential, slow):
```
dx/dt = -[1/τ + f(x, I, t)] · x + f(x, I, t)
```
We use the Closed-form Continuous-depth (CfC) solution (parallel, fast, stable):
```python
gate  = sigmoid(-f_tau(x, h))              # time-constant gating
new_h = gate * h + (1 - gate) * f_x(x, h)  # bounded update
```
Because the gate lies in (0, 1), each update is a convex combination of the previous hidden state and the candidate, so hidden states stay bounded: no explosion or collapse by construction.
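A minimal runnable version of the bounded update (a sketch under the pseudocode above, not the repo's `LiquidCfCCell`; the `tanh` on the candidate is an assumption added to keep it bounded in this illustration):

```python
import torch
import torch.nn as nn

class MiniCfC(nn.Module):
    """Illustrative CfC-style gate: new_h = sigma(-f_tau) * h + (1 - sigma(-f_tau)) * f_x."""
    def __init__(self, dim: int):
        super().__init__()
        self.f_tau = nn.Linear(2 * dim, dim)  # time-constant branch
        self.f_x = nn.Linear(2 * dim, dim)    # candidate-state branch

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        xh = torch.cat([x, h], dim=-1)
        gate = torch.sigmoid(-self.f_tau(xh))  # in (0, 1): bounded by construction
        cand = torch.tanh(self.f_x(xh))        # bounded candidate (illustrative choice)
        return gate * h + (1 - gate) * cand
```

Note there is no ODE solver loop here: the closed-form update is a single gated step, which is what makes it parallelizable across a sequence.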
### Dual-Path Processing
Each LiquidSSM Block has two parallel branches:
- SSM branch: selective scan (Mamba-style) with zigzag patterns; captures global spatial dependencies
- Liquid branch: CfC cell; adds continuous-time adaptive dynamics

A learnable mixing coefficient α balances them: `output = α·SSM + (1−α)·Liquid`
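A minimal sketch of that mixing, assuming α is stored as a logit so it stays in (0, 1) (the repo may parameterize it differently):

```python
import torch
import torch.nn as nn

class DualPathMix(nn.Module):
    """Learnable convex combination of the SSM and Liquid branch outputs (illustrative)."""
    def __init__(self):
        super().__init__()
        self.alpha_logit = nn.Parameter(torch.zeros(1))  # alpha = sigmoid(0) = 0.5 at init

    def forward(self, ssm_out: torch.Tensor, liquid_out: torch.Tensor) -> torch.Tensor:
        alpha = torch.sigmoid(self.alpha_logit)  # keeps alpha in (0, 1) during training
        return alpha * ssm_out + (1 - alpha) * liquid_out
```

Starting at α = 0.5 lets gradient descent decide per-layer how much global (SSM) versus continuous-time (Liquid) processing to use.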
## Model Variants
| Variant | Params | Image Size | Patch | GPU VRAM (bs=16) | Use Case |
|---|---|---|---|---|---|
| `tiny` | 5.9M | 128×128 | 4 | ~4 GB | Quick experiments, mobile |
| `small` | 13.7M | 128×128 | 4 | ~8 GB | Production 128×128 |
| `base` | 37.6M | 256×256 | 8 | ~12 GB | High quality |
| `512` | 38.1M | 512×512 | 16 | ~14 GB | High resolution |
## Quick Start
### Colab / Kaggle (Recommended)
Open the notebook `LiquidFlow_Training.ipynb`. It has interactive widgets for:
- Dataset selection (CIFAR-10, Flowers-102, CelebA, Fashion-MNIST, AFHQ, custom folder)
- Model size and all hyperparameters
- Auto batch-size adjustment for your GPU
### Command Line
```bash
pip install torch torchvision einops pillow matplotlib tqdm

# Quick test (CIFAR-10 32×32)
python liquidflow/train.py --model_size tiny --img_size 32 --dataset cifar10 --epochs 50 --batch_size 64

# Production (Flowers 128×128)
python liquidflow/train.py --model_size small --img_size 128 --dataset flowers --epochs 200 --batch_size 16

# Custom images
python liquidflow/train.py --model_size small --img_size 128 --dataset folder --data_dir /path/to/images
```
### Python API
```python
import torch
from liquidflow import liquidflow_small, euler_sample, make_grid_image

model = liquidflow_small(img_size=128)  # 13.7M params
# ... after training ...
model.eval()
images = euler_sample(model, (16, 3, 128, 128), num_steps=50, device='cuda')
grid = make_grid_image(images.clamp(-1, 1) * 0.5 + 0.5, nrow=4)  # map [-1,1] to [0,1]
grid.save('generated.png')
```
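Conceptually, `euler_sample` integrates the learned velocity field from noise (t=0) to data (t=1) with fixed explicit Euler steps; a sketch of that loop (not the repo's implementation):

```python
import torch

@torch.no_grad()
def euler_sample_sketch(model, shape, num_steps=50, device='cpu'):
    """Integrate dx/dt = v_theta(x, t) from t=0 (noise) to t=1 (image)."""
    x = torch.randn(shape, device=device)  # start from Gaussian noise
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((shape[0],), i * dt, device=device)
        x = x + dt * model(x, t)           # one explicit Euler step along the flow
    return x
```

Because rectified flow trains the field toward straight-line trajectories, even modest step counts (e.g. 50) give usable samples.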
## File Structure
```
├── liquidflow/
│   ├── __init__.py            # Package exports
│   ├── model.py               # Core architecture (LiquidFlowNet, LiquidCfCCell, SelectiveSSM)
│   ├── losses.py              # Physics-informed flow matching loss + EMA
│   ├── sampling.py            # Euler & Heun ODE samplers
│   └── train.py               # Full training script with CLI
├── LiquidFlow_Training.ipynb  # Colab/Kaggle notebook
├── smoke_test.py              # Comprehensive CPU test suite (25 tests)
└── README.md
```
## Physics-Informed Loss
```
L = L_flow + λ_smooth · L_smooth + λ_tv · L_tv
```
| Term | Formula | Purpose |
|---|---|---|
| L_flow | ‖v_θ(x_t, t) − (x₁ − x₀)‖² | Learn straight-line velocity field |
| L_smooth | ‖∇²x_pred‖² (Laplacian) | Penalize high-frequency noise |
| L_tv | ‖∇x_pred‖₁ (total variation) | Edge-preserving smoothness |

The physics terms are warmed up over the first 500 steps.
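The smoothness and TV terms can be sketched as follows (a minimal version; the kernel choice, padding mode, and mean reduction are assumptions, not the repo's `losses.py`):

```python
import torch
import torch.nn.functional as F

def tv_loss(x: torch.Tensor) -> torch.Tensor:
    """Anisotropic total variation: mean L1 norm of spatial finite differences."""
    dh = (x[..., 1:, :] - x[..., :-1, :]).abs().mean()
    dw = (x[..., :, 1:] - x[..., :, :-1]).abs().mean()
    return dh + dw

def laplacian_loss(x: torch.Tensor) -> torch.Tensor:
    """Smoothness term: mean squared response to a discrete 3x3 Laplacian kernel."""
    k = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]],
                     device=x.device).view(1, 1, 3, 3).repeat(x.shape[1], 1, 1, 1)
    xp = F.pad(x, (1, 1, 1, 1), mode='replicate')  # replicate padding avoids border artifacts
    lap = F.conv2d(xp, k, groups=x.shape[1])       # depthwise: one kernel per channel
    return (lap ** 2).mean()
```

Both terms vanish on constant images, so they only penalize high-frequency structure in the prediction, which is what stabilizes early training.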
## Recommended Experiments
| Goal | Dataset | Model | Image Size | Epochs | Time (T4) |
|---|---|---|---|---|---|
| Sanity check | CIFAR-10 | tiny | 32 | 20 | ~5 min |
| Baseline | CIFAR-10 | tiny | 128 | 100 | ~2 hrs |
| Quality | Flowers-102 | small | 128 | 200 | ~4 hrs |
| Faces | CelebA | small | 128 | 50 | ~6 hrs |
| High-res | CelebA | 512 | 512 | 100 | ~12 hrs |
## Mobile Export
The notebook includes TorchScript and ONNX export cells. The tiny model produces a ~24MB file for on-device inference.
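A TorchScript export along the lines of the notebook cell might look like this (a sketch; assumes the model's forward takes `(x, t)`, and the file name is illustrative):

```python
import torch

def export_torchscript(model, img_size=128, path='liquidflow_tiny.pt'):
    """Trace the velocity network with example inputs and save it for on-device use."""
    model.eval()
    x = torch.randn(1, 3, img_size, img_size)  # example image-shaped input
    t = torch.rand(1)                          # example timestep
    traced = torch.jit.trace(model, (x, t))    # record the forward graph
    traced.save(path)
    return traced
```

Because the architecture is pure PyTorch with no custom kernels, tracing needs no special handling; the same pair of example inputs also works for `torch.onnx.export`.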
## Verified (25/25 smoke tests pass)
- All 4 model variants: forward pass ✓
- Backward pass: all parameters receive gradients ✓
- Gradient health: no NaN, no Inf ✓
- Loss convergence: finite across optimizer steps ✓
- Individual components: LiquidCfCCell, SelectiveSSM, LiquidSSMBlock ✓
- Scan patterns: 4 patterns, all invertible ✓
- Sampling: Euler + Heun produce finite images ✓
- EMA: apply/restore cycle ✓
- Checkpoint: save/load round-trip ✓
- Physics loss: all terms finite and positive ✓
## References
- Hasani et al., "Liquid Time-Constant Networks", AAAI 2021 (2006.04439)
- Hasani et al., "Closed-form Continuous-depth Models", Nature MI 2022
- Gu & Dao, "Mamba: Linear-Time Sequence Modeling", 2023 (2312.00752)
- Teng et al., "DiM: Diffusion Mamba", 2024 (2405.14224)
- Hu et al., "ZigMa: Zigzag Mamba Diffusion", 2024 (2403.13802)
- Lipman et al., "Flow Matching for Generative Modeling", ICLR 2023
- Raissi et al., "Physics-Informed Neural Networks", JCP 2019 (1711.10561)
- Wang et al., "Gradient Pathologies in PINNs", 2020 (2001.04536)
- Bastek & Kochmann, "Physics-Informed Diffusion Models", 2024 (2403.14404)
- Zhu et al., "Vision Mamba", 2024 (2401.09417)
## License
MIT