Improve model card
Hi! I'm Niels, part of the community science team at Hugging Face. I've opened this PR to improve the model card for DyFN.
This PR includes:
- Added `pipeline_tag: depth-estimation` to the metadata.
- Included links to the paper on Hugging Face Papers, the project page, and the GitHub repository.
- Added a brief model description and the official BibTeX citation.
These changes help users discover, understand, and cite your work more easily. Please let me know if you have any questions!
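For reviewers who want to check the metadata change locally, here is a minimal sketch that reads the YAML front matter of the updated README and confirms the `pipeline_tag` value. The helper name `read_front_matter` is hypothetical, and the parser is a deliberately small stdlib-only stand-in that handles just the `key: value` pairs and `- item` lists used in this card. It is not the Hub's own metadata parser.

```python
def read_front_matter(card_text):
    """Parse the simple YAML front matter of a model card into a dict.

    Hypothetical helper: handles only `key: value` pairs and `- item`
    lists. A real YAML parser is the safer choice for anything richer.
    """
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the metadata block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and isinstance(meta.get(current_key), list):
            # List item belonging to the most recent key, e.g. a tag
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # An empty value means list items follow on the next lines
            meta[current_key] = value if value else []
    return meta


# Front matter as it appears in the updated README of this PR
CARD = """---
license: mit
pipeline_tag: depth-estimation
tags:
- video-depth
- monocular-geometry
- streaming
---

# DyFN: Stabilizing Streaming Video Geometry via Dynamic Feature Normalization
"""

meta = read_front_matter(CARD)
print(meta["pipeline_tag"])
print(meta["tags"])
```

The same function can be pointed at the text of a local `README.md` to confirm the tag after merging.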
README.md
CHANGED
|
@@ -1,26 +1,38 @@
|
|
| 1 |
---
|
| 2 |
license: mit
|
|
|
|
| 3 |
tags:
|
| 4 |
-
- depth-estimation
|
| 5 |
- video-depth
|
| 6 |
- monocular-geometry
|
| 7 |
- streaming
|
| 8 |
---
|
| 9 |
|
| 10 |
-
# DyFN
|
| 11 |
|
| 12 |
-
|
| 13 |
|
| 14 |
- **File:** `DyFN.pt`
|
| 15 |
- **Parameters:** ~320M
|
| 16 |
- **Base:** MoGe-ViT-L with ConvGRU temporal stabilizer
|
| 17 |
-
- **Code:** [shawLyu/Streaming_DyFN](https://github.com/shawLyu/Streaming_DyFN)
|
| 18 |
|
| 19 |
## Usage
|
| 20 |
|
| 21 |
```python
|
| 22 |
from moge.model.v1 import MoGeModel
|
| 23 |
|
|
|
|
| 24 |
model = MoGeModel.from_pretrained("shawlyu/DyFN")
|
| 25 |
```
|
| 26 |
|
|
@@ -29,3 +41,24 @@ Or pass a local path:
|
|
| 29 |
```python
|
| 30 |
model = MoGeModel.from_pretrained("./pretrained/DyFN.pt")
|
| 31 |
```
|
|
|
| 1 |
---
|
| 2 |
license: mit
|
| 3 |
+
pipeline_tag: depth-estimation
|
| 4 |
tags:
|
|
|
|
| 5 |
- video-depth
|
| 6 |
- monocular-geometry
|
| 7 |
- streaming
|
| 8 |
---
|
| 9 |
|
| 10 |
+
# DyFN: Stabilizing Streaming Video Geometry via Dynamic Feature Normalization
|
| 11 |
|
| 12 |
+
This repository contains the pretrained checkpoint for **DyFN**, a model designed for consistent 3D geometry estimation from streaming RGB input.
|
| 13 |
+
|
| 14 |
+
[**Paper**](https://huggingface.co/papers/2605.25308) | [**Project Page**](https://shawlyu.github.io/DyFN) | [**Code**](https://github.com/shawLyu/Streaming_DyFN)
|
| 15 |
+
|
| 16 |
+
## Description
|
| 17 |
+
Dynamic Feature Normalization (DyFN) is a lightweight, causal recurrent module that dynamically and robustly modulates feature statistics to maintain stable geometry over time. By finetuning only DyFN (a mere 2% additional parameters) on pretrained monocular geometry models, it effectively eliminates temporal artifacts such as disjointed layering and positional jitter without compromising single-image accuracy.
|
| 18 |
|
| 19 |
- **File:** `DyFN.pt`
|
| 20 |
- **Parameters:** ~320M
|
| 21 |
- **Base:** MoGe-ViT-L with ConvGRU temporal stabilizer
|
|
|
|
| 22 |
|
| 23 |
## Usage
|
| 24 |
|
| 25 |
+
To use this model, you can install the package via:
|
| 26 |
+
```bash
|
| 27 |
+
pip install git+https://github.com/shawLyu/Streaming_DyFN.git
|
| 28 |
+
```
|
| 29 |
+
|
| 30 |
+
Then, load the model with the following snippet:
|
| 31 |
+
|
| 32 |
```python
|
| 33 |
from moge.model.v1 import MoGeModel
|
| 34 |
|
| 35 |
+
# Load from Hugging Face Hub
|
| 36 |
model = MoGeModel.from_pretrained("shawlyu/DyFN")
|
| 37 |
```
|
| 38 |
|
|
|
|
| 41 |
```python
|
| 42 |
model = MoGeModel.from_pretrained("./pretrained/DyFN.pt")
|
| 43 |
```
|
| 44 |
+
|
| 45 |
+
## Citation
|
| 46 |
+
|
| 47 |
+
If you find this project useful in your research, please cite:
|
| 48 |
+
|
| 49 |
+
```bibtex
|
| 50 |
+
@inproceedings{lyu2026streamingdepth,
|
| 51 |
+
title={Stabilizing Streaming Video Geometry via Dynamic Feature Normalization},
|
| 52 |
+
author={Lyu, Xiaoyang and Liu, Muxin and Wu, Xiaoshan and Wang, Ruicheng and Huang, Yi-Hua and Sun, Yang-Tian and Shi, Shaoshuai and Qi, Xiaojuan},
|
| 53 |
+
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
|
| 54 |
+
year={2026}
|
| 55 |
+
}
|
| 56 |
+
|
| 57 |
+
@inproceedings{wang2025moge,
|
| 58 |
+
title={Moge: Unlocking accurate monocular geometry estimation for open-domain images with optimal training supervision},
|
| 59 |
+
author={Wang, Ruicheng and Xu, Sicheng and Dai, Cassie and Xiang, Jianfeng and Deng, Yu and Tong, Xin and Yang, Jiaolong},
|
| 60 |
+
booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
|
| 61 |
+
pages={5261--5271},
|
| 62 |
+
year={2025}
|
| 63 |
+
}
|
| 64 |
+
```
|
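The Usage section of the updated README shows two ways to load the model: by Hub ID or by a local checkpoint path. As a small sketch of how a script might choose between them, the hypothetical helper `resolve_checkpoint` below prefers a local file when it exists and otherwise falls back to the Hub ID. It is an illustration only and is not part of the DyFN or MoGe code. The demonstration uses a temporary empty file as a stand-in for `DyFN.pt`.

```python
import os
import tempfile

HUB_ID = "shawlyu/DyFN"  # repo ID taken from the README usage snippet


def resolve_checkpoint(local_path, hub_id=HUB_ID):
    """Return local_path if that file exists, otherwise the Hub ID.

    Hypothetical helper for illustration; not part of the DyFN package.
    """
    return local_path if os.path.isfile(local_path) else hub_id


# Demonstration with a temporary empty file standing in for DyFN.pt
with tempfile.TemporaryDirectory() as tmp:
    fake_ckpt = os.path.join(tmp, "DyFN.pt")
    missing = resolve_checkpoint(fake_ckpt)  # file not created yet: Hub ID
    open(fake_ckpt, "wb").close()            # create the placeholder file
    found = resolve_checkpoint(fake_ckpt)    # file exists now: local path
    found_is_local = found == fake_ckpt

print(missing)
print(found_is_local)
```

The returned string would then be passed to `MoGeModel.from_pretrained(...)`, as in the README snippets.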