---
license: apache-2.0
library_name: pytorch
tags:
  - depth-estimation
  - 3d-reconstruction
  - multi-view
  - camera-pose
  - gaussian-splatting
  - depth-anything-3
  - vggt
pipeline_tag: depth-estimation
---

# MDA: Multi-view depth & geometry checkpoints

These are the official model checkpoints for the paper
**"Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation"** (MDA).

📄 [arXiv](https://arxiv.org/abs/2606.02552)  |  🌐 [Project page](https://biansy000.github.io/mda-site/)

MDA is a mixture-density depth representation that predicts several depth
hypotheses (with their probabilities) at every pixel instead of forcing a single
depth, which largely removes the *flying-point* artifacts at object boundaries
that plague feed-forward depth estimators. See the [Citation](#citation) section
to cite this work.
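
To make the flying-point intuition concrete, here is a minimal sketch with made-up numbers. It assumes a single boundary pixel with two depth hypotheses and compares two readouts; the function names and the readout rules are illustrative assumptions, not the paper's actual inference procedure.

```python
# Illustrative only: hypothetical depths/probabilities for one boundary pixel.
# This is not the MDA readout; it only shows why averaging hypotheses can
# place a point between two surfaces.

def expected_depth(depths, probs):
    """Probability-weighted average of the depth hypotheses."""
    return sum(d * p for d, p in zip(depths, probs))

def most_likely_depth(depths, probs):
    """Depth of the hypothesis with the highest probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return depths[best]

# Foreground object at 1 m, background wall at 5 m (assumed values).
depths = [1.0, 5.0]
probs = [0.6, 0.4]

mean_depth = expected_depth(depths, probs)     # lies between the two surfaces
mode_depth = most_likely_depth(depths, probs)  # lies on one of the surfaces
print(mean_depth, mode_depth)
```

The averaged value falls in empty space between the object and the wall, which is what a flying point looks like in a point cloud, while keeping the hypotheses separate lets a readout stay on a real surface.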

These two checkpoints are used for multi-view geometry prediction:
spatially consistent depth and camera pose from a set of input images. They are
built on two different backbones and trained with a Mixture-of-Gaussians (MoG)
depth head and a `logl2` objective.

| File | Backbone | Wrapper | `model_choice.py` name | Params |
|---|---|---|---|---|
| [`DA3_MOG_Sky_LogL2.ckpt`](./DA3_MOG_Sky_LogL2.ckpt) | DA3 Giant | `DA3Wrapper` | `mda_mog_sky_l2` | ~1.36 B |
| [`VGGT_MOG_LogL2.ckpt`](./VGGT_MOG_LogL2.ckpt) | VGGT-1B | `VGGTWrapper` | `vggt_mog_l2` | ~1.16 B |

Both are PyTorch Lightning checkpoints (`save_weights_only=True`, Lightning 2.5.6).
State-dict keys are prefixed `net.net.*` because the network is wrapped by a
Lightning module; strip the prefix and load into the bare net. These are
**research checkpoints** and are **not** loadable through the standard
`DepthAnything3.from_pretrained` HF API.
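
The prefix stripping described above can be sketched as follows. This is a minimal example, not the project's own loader: `strip_prefix` and `load_bare_state_dict` are hypothetical helper names, it assumes the weights sit under the usual Lightning `"state_dict"` key, and it assumes keys without the `net.net.` prefix can be skipped.

```python
# Hypothetical helpers; only the "net.net." prefix comes from this model card.

def strip_prefix(state_dict, prefix="net.net."):
    """Return a new dict holding only the prefixed keys, with the prefix removed.

    Assumption: keys that do not start with the prefix are not part of the
    bare network and are skipped.
    """
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }

def load_bare_state_dict(ckpt_path, prefix="net.net."):
    """Read a Lightning checkpoint and return a state dict for the bare net."""
    import torch  # imported lazily so strip_prefix works without PyTorch

    # weights_only=False unpickles arbitrary objects stored in the checkpoint;
    # only use it on files you trust.
    ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
    return strip_prefix(ckpt["state_dict"], prefix)

# Usage, assuming `net` is the bare network built with the project's own code:
#   net.load_state_dict(load_bare_state_dict("VGGT_MOG_LogL2.ckpt"))
```

If `load_state_dict` reports missing or unexpected keys, the skipped-keys assumption probably does not hold for that checkpoint, and the key names should be inspected directly.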

## Citation

If you build on **MDA**, please cite:

```bibtex
@misc{bian2026modeling,
  title         = {Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation},
  author        = {Siyuan Bian and Congrong Xu and Jun Gao},
  year          = {2026},
  eprint        = {2606.02552},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/2606.02552}
}
```