Artanic30/NoisyGRPO_7B

Reinforcement Learning


Artanic30 committed on Jan 2

Commit 21ee4d4 · verified · 1 Parent(s): 97583c5

Update README.md

Files changed (1)

README.md +21 -3


@@ -1,3 +1,21 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+language:
+- en
+base_model:
+- Qwen/Qwen2.5-VL-7B-Instruct
+pipeline_tag: reinforcement-learning
+datasets:
+- yifanzhang114/MM-RLHF
+---
+This is the official checkpoint released for the NeurIPS 2025 paper NoisyGRPO.
+For model usage, follow the instructions in the [Qwen2.5-VL-7B-Instruct model card](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct).
+## References
+* [Model Paper](https://huggingface.co/papers/2510.21122)