	---
	license: mit
	pipeline_tag: depth-estimation
	tags:
	- video-depth
	- monocular-geometry
	- streaming
	---

	# DyFN: Stabilizing Streaming Video Geometry via Dynamic Feature Normalization

	This repository contains the pretrained checkpoint for DyFN, a model designed for consistent 3D geometry estimation from streaming RGB input.

	[Paper](https://huggingface.co/papers/2605.25308) \| [Project Page](https://shawlyu.github.io/DyFN) \| [Code](https://github.com/shawLyu/Streaming_DyFN)

	## Description
	Dynamic Feature Normalization (DyFN) is a lightweight, causal recurrent module that dynamically modulates feature statistics to keep the estimated geometry stable over time. Only the DyFN module, which adds just 2% more parameters, is finetuned on top of a pretrained monocular geometry model. This effectively eliminates temporal artifacts such as disjointed layering and positional jitter without compromising single-image accuracy.

	- File: `DyFN.pt`
	- Parameters: ~320M
	- Base: MoGe-ViT-L with ConvGRU temporal stabilizer
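
To make "causally modulating feature statistics" concrete, here is a toy illustration. It is not DyFN itself: DyFN learns its modulation with a ConvGRU during finetuning, whereas this sketch swaps in a plain exponential moving average of per-channel mean and variance. The function name `normalize_stream` and the `momentum` parameter are hypothetical and only serve the example.

```python
import numpy as np


def normalize_stream(frames, momentum=0.9, eps=1e-5):
    """Normalize per-frame features with causally smoothed channel statistics.

    frames: iterable of arrays shaped (C, H, W).
    Illustrative stand-in only; DyFN uses a learned ConvGRU, not an EMA.
    """
    running_mean = None
    running_var = None
    for feat in frames:
        # Per-channel statistics of the current frame.
        mean = feat.mean(axis=(1, 2), keepdims=True)
        var = feat.var(axis=(1, 2), keepdims=True)
        if running_mean is None:
            # First frame: nothing to smooth against yet.
            running_mean, running_var = mean, var
        else:
            # Blend with the past only, so the update stays causal.
            running_mean = momentum * running_mean + (1 - momentum) * mean
            running_var = momentum * running_var + (1 - momentum) * var
        yield (feat - running_mean) / np.sqrt(running_var + eps)
```

Because each output depends only on the current and earlier frames, the generator can be fed frames one at a time as they arrive, which is the property a streaming setting requires.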

	## Usage

	Install the package directly from GitHub:
	```bash
	pip install git+https://github.com/shawLyu/Streaming_DyFN.git
	```

	Then, load the model with the following snippet:

	```python
	from moge.model.v1 import MoGeModel

	# Load from Hugging Face Hub
	model = MoGeModel.from_pretrained("shawlyu/DyFN")
	```

	Or pass a local path:

	```python
	model = MoGeModel.from_pretrained("./pretrained/DyFN.pt")
	```
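
If the checkpoint is not on disk yet, one way to obtain a local path is to fetch `DyFN.pt` explicitly with `huggingface_hub`. This is a minimal sketch, assuming the `huggingface_hub` package is installed and that the file sits at the root of the `shawlyu/DyFN` repository. The helper name `fetch_checkpoint` and the `./pretrained` default are illustrative.

```python
from pathlib import Path


def fetch_checkpoint(local_dir: str = "./pretrained") -> Path:
    """Download DyFN.pt from the Hugging Face Hub and return its local path."""
    # Imported lazily so the helper can be defined without the dependency.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id="shawlyu/DyFN",
        filename="DyFN.pt",
        local_dir=local_dir,
    )
    return Path(path)
```

The returned path can then be passed to `MoGeModel.from_pretrained` in the same way as the local-path example.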

	## Citation

	If you find this project useful in your research, please cite:

	```bibtex
	@inproceedings{lyu2026streamingdepth,
	title={Stabilizing Streaming Video Geometry via Dynamic Feature Normalization},
	author={Lyu, Xiaoyang and Liu, Muxin and Wu, Xiaoshan and Wang, Ruicheng and Huang, Yi-Hua and Sun, Yang-Tian and Shi, Shaoshuai and Qi, Xiaojuan},
	booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
	year={2026}
	}

	@inproceedings{wang2025moge,
	title={{MoGe}: Unlocking accurate monocular geometry estimation for open-domain images with optimal training supervision},
	author={Wang, Ruicheng and Xu, Sicheng and Dai, Cassie and Xiang, Jianfeng and Deng, Yu and Tong, Xin and Yang, Jiaolong},
	booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
	pages={5261--5271},
	year={2025}
	}
	```