thxplz/HOI-R1_Qwen2.5-VL-3B-Instruct

Image-Text-to-Text


thxplz committed 22 days ago

Commit 5a3a1dc · verified · 1 Parent(s): 0ee3340

Create README.md

Files changed (1)

README.md +23 -0

README.md ADDED

	@@ -0,0 +1,23 @@

+---
+license: apache-2.0
+base_model:
+- Qwen/Qwen2.5-VL-3B-Instruct
+pipeline_tag: image-text-to-text
+---
+# HOI-R1: Exploring the Potential of Multimodal Large Language Models for Human-Object Interaction Detection
+[paper](https://arxiv.org/abs/2510.05609)
+![hoi-r1-arch](https://cdn-uploads.huggingface.co/production/uploads/63119ce2fb65b9a3e2f75e3c/tHYWwrnqBAHsoo8lIOtnM.jpeg)
+## Reference
+```text
+@article{chen2025hoi,
+  title={HOI-R1: Exploring the Potential of Multimodal Large Language Models for Human-Object Interaction Detection},
+  author={Chen, Junwen and Xiong, Peilin and Yanai, Keiji},
+  journal={arXiv preprint arXiv:2510.05609},
+  year={2025}
+}
+```