Progressive Flow Matching for 2D Gaussian Image Generation from a Single Image

Author: Xi Hu (u8200077, ANU)
Course: COMP/ENGN 6528 Computer Vision (S1 2026), ANU

Pre-trained weights for the mini-project submission. The method learns a flow-matching prior over 2D Gaussian primitives produced by a single-image VAE encoder (frozen DINOv2 + FiLM decoder), then samples with a two-step render-conditioned cascade. A short 1-step adversarial fine-tune roughly doubles output diversity (LPIPS 0.055 -> 0.118).
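
For readers unfamiliar with flow matching, the following is a minimal, generic sketch of the idea, not the code used for these weights. It assumes the common linear interpolation path between a noise sample and a data sample, with an MSE regression target of `x1 - x0`; the helper names `fm_training_pair` and `euler_sample` are hypothetical, and the render conditioning used in the cascade is not modelled here.

```python
def fm_training_pair(x0, x1, t):
    """Return (x_t, target velocity) for a linear path from noise x0 to data x1.

    Assumption: x_t = (1 - t) * x0 + t * x1, so the regression target is x1 - x0.
    """
    xt = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    velocity = [b - a for a, b in zip(x0, x1)]
    return xt, velocity


def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy check with an oracle velocity field (stands in for a trained network).
noise = [0.0, 1.0]
data = [2.0, -1.0]
_, target_v = fm_training_pair(noise, data, t=0.5)
sample = euler_sample(lambda x, t: target_v, noise, num_steps=2)
```

With a perfect velocity field the linear path is straight, which is why very few Euler steps can suffice; a trained network only approximates this.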

Files

| File | Size | What it is |
|---|---|---|
| vae_checkpoint.pth | 651 MB | Stage 1 VAE (frozen DINOv2 + 8-block FiLM decoder, 256-d latent) |
| flow_model.pth | 173 MB | Stage 2 flow U-Net (45M params, MSE-only, seed 0, 100k iters) |
| flow_model_seed1.pth | 173 MB | Multi-seed reproducibility run, seed 1 (50k iters) |
| flow_model_seed2.pth | 173 MB | Multi-seed reproducibility run, seed 2 (50k iters) |
| flow_model_gan.pth | 173 MB | GAN-fine-tuned flow (1-step x_hat trick, 5k iters) |
| discriminator_adv.pth | 2.6 MB | PatchGAN discriminator used during adversarial fine-tune |
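
After downloading, it can be useful to confirm that all six files listed above are present before running anything. The sketch below uses only the Python standard library; the file names come from the list above, while the helper name `missing_checkpoints` and the assumption that all files sit in one local directory are illustrative.

```python
from pathlib import Path

# File names as listed above; the flat directory layout is an assumption.
CHECKPOINTS = [
    "vae_checkpoint.pth",
    "flow_model.pth",
    "flow_model_seed1.pth",
    "flow_model_seed2.pth",
    "flow_model_gan.pth",
    "discriminator_adv.pth",
]


def missing_checkpoints(weights_dir):
    """Return the expected checkpoint names that are not files in weights_dir."""
    root = Path(weights_dir)
    return [name for name in CHECKPOINTS if not (root / name).is_file()]
```

Calling `missing_checkpoints("path/to/weights")` returns an empty list when every file is in place.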