Improve model card

Hi! I'm Niels, part of the community science team at Hugging Face. I've opened this PR to improve the model card for DyFN.

This PR:
- Adds `pipeline_tag: depth-estimation` to the metadata, replacing the `depth-estimation` entry under `tags`.
- Adds links to the paper on Hugging Face Papers, the project page, and the GitHub repository.
- Adds a brief model description and the official BibTeX citation.

These changes help users discover, understand, and cite your work more easily. Please let me know if you have any questions!
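
If you'd like to sanity-check the metadata after merging, here is a minimal sketch that reads the front matter of the updated README using only the Python standard library. `read_front_matter` is a hypothetical helper written for illustration, not part of any Hugging Face library, and it only handles the flat keys and simple lists used in this card rather than full YAML.

```python
# Minimal sketch: read the flat YAML front matter of a model card.
# `read_front_matter` is a hypothetical helper (an assumption for illustration);
# it is NOT a full YAML parser and only covers `key: value` pairs and `- item` lists.

def read_front_matter(readme_text: str) -> dict:
    lines = readme_text.splitlines()
    # Front matter must open with a `---` line at the very top of the file.
    if not lines or lines[0].strip() != "---":
        return {}
    metadata = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            # Closing delimiter: stop reading metadata.
            break
        if stripped.startswith("- "):
            # List item: attach it to the most recent key that opened a list.
            if current_key is not None and isinstance(metadata[current_key], list):
                metadata[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # A key with no inline value opens a list (e.g. `tags:`).
            metadata[current_key] = value if value else []
    return metadata


# The front matter as it appears in the updated README of this PR.
README_HEAD = """---
license: mit
pipeline_tag: depth-estimation
tags:
- video-depth
- monocular-geometry
- streaming
---
# DyFN: Stabilizing Streaming Video Geometry via Dynamic Feature Normalization
"""

metadata = read_front_matter(README_HEAD)
print(metadata["pipeline_tag"])
print(metadata["tags"])
```

The Hub reads `pipeline_tag` as a top-level key, which is why the sketch checks it separately from the `tags` list.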

Files changed (1)

README.md +37 -4

@@ -1,26 +1,38 @@
 ---
 license: mit
 tags:
-- depth-estimation
 - video-depth
 - monocular-geometry
 - streaming
 ---
-# DyFN
-Pretrained checkpoint for **Stabilizing Streaming Video Geometry via Dynamic Feature Normalization**.
 - **File:** `DyFN.pt`
 - **Parameters:** ~320M
 - **Base:** MoGe-ViT-L with ConvGRU temporal stabilizer
-- **Code:** [shawLyu/Streaming_DyFN](https://github.com/shawLyu/Streaming_DyFN)
 ## Usage
 ```python
 from moge.model.v1 import MoGeModel
 model = MoGeModel.from_pretrained("shawlyu/DyFN")
 ```
@@ -29,3 +41,24 @@ Or pass a local path:
 ```python
 model = MoGeModel.from_pretrained("./pretrained/DyFN.pt")
 ```

 ---
 license: mit
+pipeline_tag: depth-estimation
 tags:
 - video-depth
 - monocular-geometry
 - streaming
 ---
+# DyFN: Stabilizing Streaming Video Geometry via Dynamic Feature Normalization
+This repository contains the pretrained checkpoint for **DyFN**, a model designed for consistent 3D geometry estimation from streaming RGB input.
+[**Paper**](https://huggingface.co/papers/2605.25308) | [**Project Page**](https://shawlyu.github.io/DyFN) | [**Code**](https://github.com/shawLyu/Streaming_DyFN)
+## Description
+Dynamic Feature Normalization (DyFN) is a lightweight, causal recurrent module that dynamically modulates feature statistics to keep geometry stable over time. Fine-tuning only DyFN (about 2% additional parameters) on top of a pretrained monocular geometry model effectively eliminates temporal artifacts such as disjointed layering and positional jitter without compromising single-image accuracy.
 - **File:** `DyFN.pt`
 - **Parameters:** ~320M
 - **Base:** MoGe-ViT-L with ConvGRU temporal stabilizer
 ## Usage
+To use this model, you can install the package via:
+```bash
+pip install git+https://github.com/shawLyu/Streaming_DyFN.git
+```
+Then, load the model with the following snippet:
 ```python
 from moge.model.v1 import MoGeModel
+# Load from Hugging Face Hub
 model = MoGeModel.from_pretrained("shawlyu/DyFN")
 ```
 ```python
 model = MoGeModel.from_pretrained("./pretrained/DyFN.pt")
 ```
+## Citation
+If you find this project useful in your research, please cite:
+```bibtex
+@inproceedings{lyu2026streamingdepth,
+  title={Stabilizing Streaming Video Geometry via Dynamic Feature Normalization},
+  author={Lyu, Xiaoyang and Liu, Muxin and Wu, Xiaoshan and Wang, Ruicheng and Huang, Yi-Hua and Sun, Yang-Tian and Shi, Shaoshuai and Qi, Xiaojuan},
+  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+  year={2026}
+}
+@inproceedings{wang2025moge,
+  title={Moge: Unlocking accurate monocular geometry estimation for open-domain images with optimal training supervision},
+  author={Wang, Ruicheng and Xu, Sicheng and Dai, Cassie and Xiang, Jianfeng and Deng, Yu and Tong, Xin and Yang, Jiaolong},
+  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
+  pages={5261--5271},
+  year={2025}
+}
+```