GLD: Geometric Latent Diffusion

Repurposing Geometric Foundation Models for Multi-view Diffusion

[Project Page] | [Code]

Quick Start

```bash
git clone https://github.com/cvlab-kaist/GLD.git
cd GLD
conda env create -f environment.yml
conda activate gld

# Download all checkpoints
python -c "from huggingface_hub import snapshot_download; snapshot_download('SeonghuJeon/GLD', local_dir='.')"

# Run demo
./run_demo.sh da3
```

Files

| File | Description | Params | Size |
|------|-------------|--------|------|
| `checkpoints/da3_level1.pt` | DA3 Level-1 diffusion (EMA) | 783M | 2.9G |
| `checkpoints/da3_cascade.pt` | DA3 Cascade: L1→L0 (EMA) | 473M | 1.8G |
| `checkpoints/vggt_level1.pt` | VGGT Level-1 diffusion (EMA) | 806M | 3.0G |
| `checkpoints/vggt_cascade.pt` | VGGT Cascade: L1→L0 (EMA) | 806M | 3.0G |
| `pretrained_models/da3/model.safetensors` | DA3-Base encoder | 135M | 0.5G |
| `pretrained_models/da3/dpt_decoder.pt` | DPT decoder (depth + geometry) | - | 1.1G |
| `pretrained_models/mae_decoder.pt` | DA3 MAE decoder (EMA, decoder-only) | 423M | 1.6G |
| `pretrained_models/vggt/mae_decoder.pt` | VGGT MAE decoder (EMA, decoder-only) | 425M | 1.6G |

Stage-2 and MAE decoder checkpoints contain EMA weights only; the MAE decoder checkpoints additionally omit the encoder (decoder weights only).
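Because the MAE decoder checkpoints ship without encoder weights, loading them into a full model should use `strict=False` so the absent encoder keys are skipped rather than raised as errors. A minimal PyTorch sketch (the module names and shapes below are illustrative stand-ins, not the actual GLD classes):

```python
import torch
from torch import nn

class TinyMAE(nn.Module):
    """Stand-in for a model with an encoder and a decoder submodule."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 8)  # illustrative encoder
        self.decoder = nn.Linear(8, 8)  # illustrative MAE decoder

model = TinyMAE()

# A decoder-only checkpoint contains keys like "decoder.weight" and
# "decoder.bias"; encoder keys are absent by design.
decoder_ckpt = {
    "decoder.weight": torch.zeros(8, 8),
    "decoder.bias": torch.zeros(8),
}

# strict=False tolerates the missing encoder keys and reports them.
missing, unexpected = model.load_state_dict(decoder_ckpt, strict=False)
# missing  -> the encoder keys not present in the checkpoint
# unexpected -> checkpoint keys the model does not have (empty here)
```

The same pattern applies when pairing a decoder-only checkpoint with encoder weights loaded separately (e.g. from `pretrained_models/da3/model.safetensors`).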
