Progressive Flow Matching for 2D Gaussian Image Generation from a Single Image
Author: Xi Hu (u8200077, ANU)
Course: COMP/ENGN 6528 Computer Vision (S1 2026), ANU
This repository contains the pre-trained weights for the mini-project submission. The method learns a flow-matching prior over 2D Gaussian primitives produced by a single-image VAE encoder (frozen DINOv2 + FiLM decoder), then samples with a two-step render-conditioned cascade. A short 1-step adversarial fine-tune roughly doubles output diversity as measured by LPIPS (0.055 -> 0.118).
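For readers unfamiliar with flow matching, the sketch below illustrates the general idea behind an MSE-trained flow prior. It assumes the common linear interpolation path between a noise sample and a data sample; this card does not state which path or sampler the project uses, so this is an illustration rather than the project's implementation. All function names are hypothetical, plain Python lists stand in for the Gaussian-parameter tensors, and the render-conditioned cascade is not modeled.

```python
def linear_path_sample(x0, x1, t):
    """Point at time t in [0, 1] on the straight line from noise x0 to data x1."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Regression target for the linear path: the constant velocity x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def mse(pred, target):
    """Mean squared error between a predicted and a target velocity."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)


def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy check: with an oracle velocity that always points at one fixed data
# sample, a few Euler steps carry the noise sample onto that data sample.
noise = [0.0, 2.0, -1.0]
data = [1.0, 0.0, 3.0]
oracle = lambda x, t: [(b - a) / (1.0 - t) for a, b in zip(x, data)]
sample = euler_sample(oracle, noise, num_steps=2)
```

In training, a network would replace `oracle` and be fitted by minimising `mse(network(x_t, t), target_velocity(x0, x1))` over random `t`; the step count used here is arbitrary and is not meant to mirror the two-step cascade.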
Files
| File | Size | What it is |
|---|---|---|
| vae_checkpoint.pth | 651 MB | Stage 1 VAE (frozen DINOv2 + 8-block FiLM decoder, 256-d latent) |
| flow_model.pth | 173 MB | Stage 2 flow U-Net (45M params, MSE-only, seed 0, 100k iters) |
| flow_model_seed1.pth | 173 MB | Multi-seed reproducibility run, seed 1 (50k iters) |
| flow_model_seed2.pth | 173 MB | Multi-seed reproducibility run, seed 2 (50k iters) |
| flow_model_gan.pth | 173 MB | GAN-fine-tuned flow (1-step x_hat trick, 5k iters) |
| discriminator_adv.pth | 2.6 MB | PatchGAN discriminator used during adversarial fine-tune |
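
After downloading, it can be worth checking that every file arrived and is roughly the listed size. The helper below is a minimal sketch of such a check, not part of the project: `check_downloads` and `EXPECTED_MB` are hypothetical names, the sizes are copied from the file list above, and the 10% tolerance is an assumption chosen to absorb rounding and the MB versus MiB ambiguity.

```python
import os

# Sizes in MB as listed in the file table (assumed to be rounded values).
EXPECTED_MB = {
    "vae_checkpoint.pth": 651,
    "flow_model.pth": 173,
    "flow_model_seed1.pth": 173,
    "flow_model_seed2.pth": 173,
    "flow_model_gan.pth": 173,
    "discriminator_adv.pth": 2.6,
}


def check_downloads(directory, expected_mb=EXPECTED_MB, rel_tol=0.1):
    """Return {filename: status}, where status is 'ok', 'missing' or 'size mismatch'."""
    report = {}
    for name, size_mb in expected_mb.items():
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            report[name] = "missing"
            continue
        actual_mb = os.path.getsize(path) / 1e6  # bytes -> decimal megabytes
        within_tol = abs(actual_mb - size_mb) <= rel_tol * size_mb
        report[name] = "ok" if within_tol else "size mismatch"
    return report
```

Calling `check_downloads(".")` from the download directory returns one status per file; a size mismatch usually points to an interrupted download.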