nielsr HF Staff committed on
Commit
79c747f
·
verified ·
1 Parent(s): d447423

Improve model card: Add pipeline tag, library name, and links


This PR improves the model card for `MomaGraph-R1` with the following updates:

- Adds `pipeline_tag: image-text-to-text` to the metadata, enabling better discoverability and the inference widget on the Hugging Face Hub.
- Adds `library_name: transformers` to the metadata, as indicated by the presence of `transformers_version` and `Qwen2_5_VLForConditionalGeneration` in `config.json`. This will enable the automated "how to use" code snippet.
- Adds a comprehensive model description based on the paper abstract.
- Includes links to the official paper, project page, and GitHub repository in the model card content.
- Adds a "Usage" section that directs users to the GitHub repository for code examples and a "Citation" section.

Please review and merge if these improvements align with the project goals.

Files changed (1)
  1. README.md +36 -3
README.md CHANGED
@@ -1,3 +1,36 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ pipeline_tag: image-text-to-text
+ library_name: transformers
+ ---
+
+ # MomaGraph-R1
+
+ This repository contains **MomaGraph-R1**, a 7B vision-language model presented in the paper [MomaGraph: State-Aware Unified Scene Graphs with Vision-Language Model for Embodied Task Planning](https://huggingface.co/papers/2512.16909).
+
+ MomaGraph-R1 introduces a unified scene representation for embodied agents, integrating spatial-functional relationships and part-level interactive elements to address the needs of mobile manipulators in household environments. Trained with reinforcement learning on the MomaGraph-Scenes dataset, MomaGraph-R1 predicts task-oriented scene graphs and serves as a zero-shot task planner under a Graph-then-Plan framework.
+
+ It achieves state-of-the-art results among open-source models, reaching 71.6% accuracy on the MomaGraph-Bench benchmark (+11.4% over the best baseline), while generalizing across public benchmarks and transferring effectively to real-robot experiments.
+
+ * **Project Page:** https://hybridrobotics.github.io/MomaGraph/
+ * **Code:** https://github.com/HybridRobotics/MomaGraph
+
+ ## Usage
+
+ This model is compatible with the Hugging Face `transformers` library. For detailed usage instructions and code examples, please refer to the official [GitHub repository](https://github.com/HybridRobotics/MomaGraph).
+
+ ## Citation
+
+ If you find our work helpful or inspiring, please consider citing the paper:
+
+ ```bibtex
+ @misc{ju2025momagraph,
+       title={MomaGraph: State-Aware Unified Scene Graphs with Vision-Language Model for Embodied Task Planning},
+       author={Yuanchen Ju and Yongyuan Liang and Yen-Jen Wang and Nandiraju Gireesh and Yuanliang Ju and Seungjae Lee and Qiao Gu and Elvis Hsieh and Furong Huang and Koushil Sreenath},
+       year={2025},
+       eprint={2512.16909},
+       archivePrefix={arXiv},
+       primaryClass={cs.RO},
+       url={https://arxiv.org/abs/2512.16909},
+ }
+ ```
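
Since the diff adds `pipeline_tag: image-text-to-text` and `library_name: transformers` (with `Qwen2_5_VLForConditionalGeneration` in `config.json`), inference would follow the standard Qwen2.5-VL chat-message pattern. Below is a minimal, dependency-free sketch of that input format; the repo id `HybridRobotics/MomaGraph-R1` and the prompt text are assumptions for illustration, and the actual model/processor loading is shown only as a commented outline.

```python
# Sketch of the image-text-to-text message format used by
# Qwen2.5-VL-style models such as MomaGraph-R1. Only the message
# construction runs here; loading weights (commented out below)
# requires `transformers` and the hosted checkpoint.

def build_messages(image_path: str, prompt: str) -> list[dict]:
    """Compose a single-turn conversation with one image and one text prompt."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": prompt},
            ],
        }
    ]

# Hypothetical prompt; the repo's own examples may differ.
messages = build_messages("scene.jpg", "Predict the task-oriented scene graph.")

# With transformers installed (repo id is an assumption, not confirmed):
# from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
# model = Qwen2_5_VLForConditionalGeneration.from_pretrained("HybridRobotics/MomaGraph-R1")
# processor = AutoProcessor.from_pretrained("HybridRobotics/MomaGraph-R1")
```

The automated "how to use" snippet that `library_name: transformers` enables on the Hub would fill in the exact loading calls; the GitHub repository remains the authoritative source for usage.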