glab-caltech/VALOR-GroundingDINO

Object Detection

computer-vision

dmarsili committed on Dec 11, 2025

Commit b2cc72d · verified · 1 Parent(s): 07cd2e7

Update README.md

Files changed (1)

README.md +38 -3

@@ -1,3 +1,38 @@
----
-license: mit
----

+---
+license: mit
+datasets:
+- lmms-lab/GQA
+- dmarsili/Omni3D-Bench
+- cambridgeltl/vsr_random
+- snowclipsed/TallyQA
+language:
+- en
+base_model:
+- ShilongLiu/GroundingDINO
+pipeline_tag: object-detection
+tags:
+- object-detection
+- computer-vision
+---
+# Model Card for VALOR-GroundingDINO
+This is the verified-tuned GroundingDINO model from the paper [No Labels, No Problem: Training Visual Reasoners with Multimodal Verifiers](https://glab-caltech.github.io/valor/).
+For further information, please refer to the [project webpage](https://glab-caltech.github.io/valor/), [paper](https://arxiv.org/abs/2512.08889), and [repository](https://github.com/damianomarsili/VALOR).
+## Citation
+If you use VALOR in your research, please consider citing our work:
+**BibTeX:**
+```
+@misc{marsili2025labelsproblemtrainingvisual,
+      title={No Labels, No Problem: Training Visual Reasoners with Multimodal Verifiers},
+      author={Damiano Marsili and Georgia Gkioxari},
+      year={2025},
+      eprint={2512.08889},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2512.08889},
+}
+```