Improve model card with metadata, links, and usage
Hi! I'm Niels from the community science team at Hugging Face. I'm opening this PR to enhance the model card for GeoMotion.
Improvements include:
- Added `pipeline_tag: image-segmentation` to the metadata to improve discoverability on the Hub.
- Included links to the research paper and the official GitHub repository.
- Added a brief description of the model's architecture and approach.
- Provided a sample usage snippet for inference based on the instructions in the repository.
- Added the BibTeX citation for the work.
README.md
CHANGED

````diff
@@ -1,3 +1,48 @@
----
-license: mit
----
+---
+license: mit
+pipeline_tag: image-segmentation
+tags:
+- motion-segmentation
+- 4d-geometry
+---
+
+# GeoMotion: Rethinking Motion Segmentation via Latent 4D Geometry
+
+GeoMotion is a fully learning-based motion segmentation framework that directly infers dynamic masks from latent 4D geometry. It combines 4D geometric priors from a pretrained reconstruction model ($\pi^3$) with local pixel-level motion cues from optical flow, disentangling object motion from camera motion in a single feed-forward pass.
+
+- **Paper:** [GeoMotion: Rethinking Motion Segmentation via Latent 4D Geometry](https://huggingface.co/papers/2602.21810)
+- **GitHub Repository:** [zjutcvg/GeoMotion](https://github.com/zjutcvg/GeoMotion)
+
+## Introduction
+
+Motion segmentation in dynamic scenes is challenging because traditional methods rely on noisy camera pose estimates and point correspondences. GeoMotion bypasses explicit correspondence estimation and instead learns to implicitly disentangle object and camera motion via attention mechanisms. Building on recent advances in 4D scene geometry reconstruction, it leverages reliable camera poses and rich spatio-temporal priors to achieve state-of-the-art performance with high efficiency.
+
+## Sample Usage
+
+To perform motion segmentation on a single sequence, use the provided inference script. Make sure the environment is set up and the required checkpoints are downloaded as described in the [official repository](https://github.com/zjutcvg/GeoMotion).
+
+```bash
+python motion_seg_inference.py \
+    --model_path checkpoint/best_model.pth \
+    --pi3_model_path checkpoint/model.safetensors \
+    --input_dir data/test/<sequence_name> \
+    --output_dir output/single \
+    --sequence_length 32 \
+    --threshold 0.5
+```
+
+## Citation
+
+If you find GeoMotion useful for your research, please cite:
+
+```bibtex
+@misc{he2026geomotion,
+      title={GeoMotion: Rethinking Motion Segmentation via Latent 4D Geometry},
+      author={Xiankang He and Peile Lin and Ying Cui and Dongyan Guo and Chunhua Shen and Xiaoqin Zhang},
+      year={2026},
+      eprint={2602.21810},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2602.21810},
+}
+```
````