license: apache-2.0
library_name: pytorch
tags:
- depth-estimation
- 3d-reconstruction
- multi-view
- camera-pose
- gaussian-splatting
- depth-anything-3
- vggt
pipeline_tag: depth-estimation
MDA: Multi-view depth & geometry checkpoints
These are the official model checkpoints for the paper "Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation" (MDA).
arXiv | Project page
MDA is a mixture-density depth representation that predicts several depth hypotheses (with their probabilities) at every pixel instead of forcing a single depth, which largely removes the flying-point artifacts at object boundaries that plague feed-forward depth estimators. See the Citation section to cite this work.
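To illustrate why keeping several hypotheses helps, here is a toy sketch for a single boundary pixel. It is not the decoding rule used by MDA; the function names, depth values, and probabilities are hypothetical and chosen only to show how averaging two surfaces produces a point that lies on neither, while selecting the most probable hypothesis stays on a real surface.

```python
def mean_depth(depths, probs):
    """Probability-weighted average of the depth hypotheses."""
    return sum(d * p for d, p in zip(depths, probs))


def most_likely_depth(depths, probs):
    """Depth of the hypothesis with the highest probability."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return depths[best]


# Hypothetical boundary pixel: foreground at 2 m, background at 10 m.
depths = [2.0, 10.0]
probs = [0.75, 0.25]

blended = mean_depth(depths, probs)        # 4.0, between the two surfaces
picked = most_likely_depth(depths, probs)  # 2.0, on the foreground surface
```

A single-valued regressor behaves like `mean_depth` at ambiguous pixels, which is where flying points come from; a mixture representation makes the alternative available.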
These two checkpoints are used for multi-view geometry prediction:
spatially consistent depth and camera pose from a set of input images. They are
built on two different backbones and trained with a Mixture-of-Gaussians (MoG)
depth head and a logl2 objective.
| File | Backbone | Wrapper | model_choice.py name | Params |
|---|---|---|---|---|
| DA3_MOG_Sky_LogL2.ckpt | DA3 Giant | DA3Wrapper | mda_mog_sky_l2 | ~1.36 B |
| VGGT_MOG_LogL2.ckpt | VGGT-1B | VGGTWrapper | vggt_mog_l2 | ~1.16 B |
Both are PyTorch Lightning checkpoints (save_weights_only=True, Lightning 2.5.6).
State-dict keys are prefixed net.net.* because the network is wrapped by a
Lightning module; strip the prefix and load into the bare net. These are research checkpoints and are not loadable
through the standard DepthAnything3.from_pretrained HF API.
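A minimal sketch of the prefix stripping, under two assumptions: the whole `net.net.` prefix is removed, and any key without that prefix does not belong to the bare network and can be dropped. The helper names `strip_prefix` and `load_bare_state_dict` are hypothetical and not part of the MDA code base. Lightning stores model weights under the checkpoint's `state_dict` entry.

```python
from collections import OrderedDict

PREFIX = "net.net."  # prefix stated in this card; adjust if your keys differ


def strip_prefix(state_dict, prefix=PREFIX):
    """Return a new ordered mapping with `prefix` removed from matching keys.

    Keys that do not start with `prefix` are dropped (an assumption of
    this sketch, not a documented rule).
    """
    return OrderedDict(
        (key[len(prefix):], value)
        for key, value in state_dict.items()
        if key.startswith(prefix)
    )


def load_bare_state_dict(ckpt_path, prefix=PREFIX):
    """Read a Lightning checkpoint and return weights keyed for the bare net."""
    import torch  # imported lazily so strip_prefix has no dependencies

    # weights_only=False unpickles arbitrary objects: use only on trusted files.
    ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
    return strip_prefix(ckpt["state_dict"], prefix)
```

With the bare network built through the repository's own wrapper code (not shown here), the result can be passed to `net.load_state_dict(load_bare_state_dict("VGGT_MOG_LogL2.ckpt"))`. Inspecting the missing and unexpected keys it reports is a quick check that the prefix assumption holds.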
Citation
If you build on MDA, please cite:
@misc{bian2026modeling,
title = {Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation},
author = {Siyuan Bian and Congrong Xu and Jun Gao},
year = {2026},
eprint = {2606.02552},
archivePrefix = {arXiv},
primaryClass = {cs.CV},
url = {https://arxiv.org/abs/2606.02552}
}