Upload folder using huggingface_hub


Files changed (3)

DA3_MOG_Sky_LogL2.ckpt +3 -0
README.md +57 -0
VGGT_MOG_LogL2.ckpt +3 -0

DA3_MOG_Sky_LogL2.ckpt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d442e856a6be8832aac72a0f17293ae98af75a5bd8d21adfbe337bd915e41d83
+size 5423360059

README.md ADDED

	@@ -0,0 +1,57 @@

+---
+license: apache-2.0
+library_name: pytorch
+tags:
+  - depth-estimation
+  - 3d-reconstruction
+  - multi-view
+  - camera-pose
+  - gaussian-splatting
+  - depth-anything-3
+  - vggt
+pipeline_tag: depth-estimation
+---
+# MDA — Multi-view depth & geometry checkpoints
+These are the official model checkpoints for the paper
+**"Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation"** (MDA).
+📄 [arXiv](https://arxiv.org/abs/2606.02552) &nbsp;|&nbsp; 🌐 [Project page](https://biansy000.github.io/mda-site/)
+MDA is a mixture-density depth representation that predicts several depth
+hypotheses (with their probabilities) at every pixel instead of forcing a single
+depth, which largely removes the *flying-point* artifacts at object boundaries
+that plague feed-forward depth estimators. See the [Citation](#citation) section
+to cite this work.
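To see why keeping several hypotheses per pixel helps, here is a toy sketch in plain Python. It is an illustration only: the weights and depths are made-up numbers, the function names are hypothetical, and picking the most probable component is an assumption for this example rather than the read-out rule used by MDA (the paper is the reference for that).

```python
def expected_depth(weights, means):
    """Probability-weighted average of the depth hypotheses."""
    return sum(w * m for w, m in zip(weights, means))


def most_likely_depth(weights, means):
    """Depth of the hypothesis with the highest probability."""
    best = max(range(len(weights)), key=lambda i: weights[i])
    return means[best]


# A boundary pixel that could belong to a foreground object at 2 m
# or to the background at 10 m (illustrative values).
weights = [0.75, 0.25]
means = [2.0, 10.0]

mean_depth = expected_depth(weights, means)     # lands between the two surfaces
mode_depth = most_likely_depth(weights, means)  # stays on one surface

print(mean_depth, mode_depth)  # 4.0 2.0
```

The averaged value of 4.0 m lies on neither surface, which is what a flying point looks like in a point cloud; the selected hypothesis stays on the foreground at 2.0 m.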
+The two checkpoints in this repository target multi-view geometry prediction:
+spatially consistent depth and camera poses from a set of input images. They are
+built on two different backbones, and both are trained with a Mixture-of-Gaussians
+(MoG) depth head and a `logl2` objective.
+| File | Backbone | Wrapper | `model_choice.py` name | Params |
+|---|---|---|---|---|
+| [`DA3_MOG_Sky_LogL2.ckpt`](./DA3_MOG_Sky_LogL2.ckpt) | DA3 Giant | `DA3Wrapper` | `mda_mog_sky_l2` | ~1.36 B |
+| [`VGGT_MOG_LogL2.ckpt`](./VGGT_MOG_LogL2.ckpt) | VGGT-1B | `VGGTWrapper` | `vggt_mog_l2` | ~1.16 B |
+Both are PyTorch Lightning checkpoints (`save_weights_only=True`, Lightning 2.5.6).
+State-dict keys are prefixed with `net.net.` because the network is wrapped by a
+Lightning module, so strip that prefix before loading the weights into the bare network. These are **research checkpoints** and are **not** loadable
+through the standard `DepthAnything3.from_pretrained` HF API.
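The prefix stripping described above can be sketched as follows. This is a minimal sketch, not the project's loading code: the helper names `strip_prefix` and `load_bare_state_dict` are hypothetical, it assumes the full `net.net.` prefix is removed, and it assumes that keys without the prefix belong to the Lightning wrapper and can be dropped.

```python
from collections import OrderedDict

PREFIX = "net.net."  # prefix named in the README; assumed to be stripped in full


def strip_prefix(state_dict, prefix=PREFIX):
    """Return a copy of state_dict with `prefix` removed from matching keys.

    Keys that do not start with `prefix` are dropped (an assumption: they are
    taken to belong to the Lightning wrapper, not to the bare network).
    """
    return OrderedDict(
        (key[len(prefix):], value)
        for key, value in state_dict.items()
        if key.startswith(prefix)
    )


def load_bare_state_dict(ckpt_path):
    """Read a Lightning checkpoint and return weights keyed for the bare net."""
    import torch  # imported lazily so strip_prefix works without PyTorch

    # weights_only=False because Lightning checkpoints can hold non-tensor
    # metadata; only use it on files you trust.
    ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
    return strip_prefix(ckpt["state_dict"])


# Toy example with stand-in values instead of tensors.
toy = {"net.net.head.weight": 1, "net.net.head.bias": 2, "loss.scale": 3}
bare = strip_prefix(toy)
print(list(bare))  # ['head.weight', 'head.bias']
```

With a network instance built from the project's own code (construction is not covered here), the result would be passed to `load_state_dict`, for example `net.load_state_dict(load_bare_state_dict("VGGT_MOG_LogL2.ckpt"))`.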
+## Citation
+If you build on **MDA**, please cite:
+```bibtex
+@misc{bian2026modeling,
+  title         = {Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation},
+  author        = {Siyuan Bian and Congrong Xu and Jun Gao},
+  year          = {2026},
+  eprint        = {2606.02552},
+  archivePrefix = {arXiv},
+  primaryClass  = {cs.CV},
+  url           = {https://arxiv.org/abs/2606.02552}
+}
+```

VGGT_MOG_LogL2.ckpt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d82e60abcf9f26dca7ac0056d5f5db893e6e6827f0dd7e242bcff93c079a4177
+size 4632792671
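
The two `.ckpt` entries in this commit are Git LFS pointers: each records the SHA-256 digest and byte size of the real file. After downloading a checkpoint, those two fields can be used to check that the file arrived intact. The sketch below is one way to do that with the standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the pointer text is copied from the `VGGT_MOG_LogL2.ckpt` entry above without the leading `+` diff markers.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Check a local file's size and hash against a parsed pointer."""
    # Cheap check first: a size mismatch avoids hashing several gigabytes.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Read in chunks so the whole checkpoint never sits in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:d82e60abcf9f26dca7ac0056d5f5db893e6e6827f0dd7e242bcff93c079a4177\n"
    "size 4632792671\n"
)
print(pointer["algo"], pointer["size"])  # sha256 4632792671
```

Calling `matches_pointer("VGGT_MOG_LogL2.ckpt", pointer)` on a downloaded copy would then return `True` only if both the size and the digest agree.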