RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping

Dongming Wu, Yanping Fu, Saike Huang, Yingfei Liu, Fan Jia, Nian Liu, Feng Dai, Tiancai Wang, Rao Muhammad Anwer, Fahad Shahbaz Khan, Jianbing Shen

📝 TL;DR

  • To push forward general robotic grasping, we introduce a large-scale reasoning-based affordance segmentation benchmark, RAGNet. It contains 273k images, 180 categories, and 26k reasoning instructions.
  • Furthermore, we propose a comprehensive affordance-based grasping framework, named AffordanceNet, which consists of a VLM (named AffordanceVLM) pre-trained on our large-scale affordance data and a grasping network that conditions on the predicted affordance map to grasp the target.

📰 News

  • [2025.08] The paper is released on arXiv.
  • [2025.07] Inference code and the AffordanceVLM model are released. Feel free to try them out!
  • [2025.06] The paper is accepted to ICCV 2025!

🚀 Getting Started

📊 Main Results

🔹 Affordance Segmentation

| Method | HANDAL gIoU | HANDAL cIoU | HANDAL† gIoU | HANDAL† cIoU | GraspNet seen gIoU | GraspNet seen cIoU | GraspNet novel gIoU | GraspNet novel cIoU | 3DOI gIoU | 3DOI cIoU |
|---|---|---|---|---|---|---|---|---|---|---|
| AffordanceNet | 60.3 | 60.8 | 60.5 | 60.3 | 63.3 | 64.0 | 45.6 | 33.2 | 37.4 | 37.4 |

🔸 Reasoning-Based Affordance Segmentation

| Method | HANDAL (easy) gIoU | HANDAL (easy) cIoU | HANDAL (hard) gIoU | HANDAL (hard) cIoU | 3DOI gIoU | 3DOI cIoU |
|---|---|---|---|---|---|---|
| AffordanceNet | 58.3 | 58.1 | 58.2 | 57.8 | 38.1 | 39.4 |
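
This README does not define the gIoU and cIoU columns used in the tables above. The sketch below assumes the convention common in reasoning-segmentation work: gIoU is the mean of per-image IoUs, and cIoU is the cumulative intersection divided by the cumulative union over the whole split. The function names, the nested-list mask format, and the handling of empty masks are illustrative assumptions, not the repository's evaluation code.

```python
from itertools import chain


def mask_counts(pred, gt):
    """Return (intersection, union) pixel counts for two same-shaped
    binary masks given as lists of rows (hypothetical input format)."""
    inter = union = 0
    for p, g in zip(chain.from_iterable(pred), chain.from_iterable(gt)):
        p, g = bool(p), bool(g)
        inter += p and g
        union += p or g
    return inter, union


def giou_ciou(preds, gts):
    """Compute (gIoU, cIoU) as fractions in [0, 1] over paired masks.

    Assumed convention: an empty prediction on an empty target scores 1.0.
    """
    ious = []
    total_inter = total_union = 0
    for pred, gt in zip(preds, gts):
        inter, union = mask_counts(pred, gt)
        ious.append(inter / union if union else 1.0)
        total_inter += inter
        total_union += union
    giou = sum(ious) / len(ious)  # mean of per-image IoUs
    ciou = total_inter / total_union if total_union else 1.0  # pooled over all pixels
    return giou, ciou


# Two toy 2x2 examples: the first prediction over-segments, the second is exact.
preds = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
gts = [[[1, 0], [0, 0]], [[1, 0], [0, 0]]]
giou, ciou = giou_ciou(preds, gts)
print(f"gIoU={100 * giou:.1f}  cIoU={100 * ciou:.1f}")
```

Under this convention cIoU weights images by mask area, so large objects dominate it, while gIoU treats every image equally. The tables report both as percentages.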

📚 Citation

If you find our work useful, please consider citing:

@inproceedings{wu2025ragnet,
  title={RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping},
  author={Wu, Dongming and Fu, Yanping and Huang, Saike and Liu, Yingfei and Jia, Fan and Liu, Nian and Dai, Feng and Wang, Tiancai and Anwer, Rao Muhammad and Khan, Fahad Shahbaz and others},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={11980--11990},
  year={2025}
}

🙏 Acknowledgements

We thank the authors who open-sourced the following projects.