Improve model card, add library name and links to paper/code

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +32 -5
@@ -1,9 +1,36 @@
  ---
- license: cc
- language:
- - en
  base_model:
  - Qwen/Qwen3.5-4B
- pipeline_tag: visual-question-answering
+ language:
+ - en
+ license: apache-2.0
+ pipeline_tag: image-text-to-text
+ library_name: transformers
  ---
- arxiv.org/abs/2606.25319
+
+ # V-Zero: Answer-Label-Free On-Policy Distillation with Contrastive Evidence Gating for Fine-Grained Visual Reasoning
+
+ This repository contains the V-Zero 4B checkpoint, introduced in the paper [V-Zero: Answer-Label-Free On-Policy Distillation with Contrastive Evidence Gating for Fine-Grained Visual Reasoning](https://arxiv.org/abs/2606.25319).
+
+ * **Code Repository:** [GitHub - eVI-group-SCU/V-Zero](https://github.com/eVI-group-SCU/V-Zero)
+
+ ## Overview
+
+ V-Zero is an answer-label-free framework designed to improve fine-grained visual reasoning in multimodal large language models (MLLMs). It bypasses the need for costly external answer labels or manual verification rules by utilizing on-policy distillation combined with contrastive evidence gating. During training, the student model samples trajectories on the full image, while a teacher model replays those trajectories under paired positive (task-relevant) and negative (task-irrelevant) crops to evaluate student-sampled reasoning paths.
+
+ <p align="center">
+ <img src="https://raw.githubusercontent.com/eVI-group-SCU/V-Zero/main/resource/method.png" alt="V-Zero Method Overview" width="100%">
+ </p>
+
+ ## Citation
+
+ If you find this work useful for your research, please cite the paper:
+
+ ```bibtex
+ @article{sun2026vzero,
+ title={V-Zero: Answer-Label-Free On-Policy Distillation with Contrastive Evidence Gating for Fine-Grained Visual Reasoning},
+ author={Sun, Haoxiang and Yi, Zhihang and Deng, Langxuan and Zhou, Yuhao and Jia, Peiqi and Zhao, Jian and Yuan, Li and Lv, Jiancheng and Wang, Tao},
+ journal={arXiv preprint arXiv:2606.25319},
+ year={2026}
+ }
+ ```
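
Note on the Overview paragraph in the new README: it describes the training signal only in prose, so a small illustration may help readers. The following is a minimal sketch, not the V-Zero implementation. It assumes (hypothetically) that the teacher scores each student-sampled token twice, once under the positive crop and once under the negative crop, that the gate keeps tokens the teacher finds more likely under the positive crop, and that the distillation term is the usual single-sample reverse-KL estimate. The function names, the `margin` parameter, and the hard 0/1 gate are all assumptions made for illustration; the paper and the linked code repository define the actual method.

```python
from typing import List


def contrastive_gate(
    logp_pos: List[float], logp_neg: List[float], margin: float = 0.0
) -> List[float]:
    """Hypothetical hard gate over the tokens of one student trajectory.

    logp_pos[i] / logp_neg[i] are the teacher's log-probabilities of the
    i-th student-sampled token when the teacher sees the positive
    (task-relevant) or negative (task-irrelevant) crop.
    A token passes the gate (1.0) only if the positive crop makes it
    more likely than the negative crop by more than `margin`.
    """
    if len(logp_pos) != len(logp_neg):
        raise ValueError("positive and negative score lists must align")
    return [1.0 if p - n > margin else 0.0 for p, n in zip(logp_pos, logp_neg)]


def gated_distillation_loss(
    student_logp: List[float],
    teacher_logp_pos: List[float],
    teacher_logp_neg: List[float],
    margin: float = 0.0,
) -> float:
    """Assumed loss: gated single-sample reverse-KL estimate.

    For each token sampled by the student, (student_logp - teacher_logp_pos)
    is a one-sample estimate of the reverse KL term. Tokens that fail the
    gate contribute nothing. The sum is averaged over gated tokens only.
    """
    if len(student_logp) != len(teacher_logp_pos):
        raise ValueError("student and teacher score lists must align")
    gate = contrastive_gate(teacher_logp_pos, teacher_logp_neg, margin)
    kept = sum(gate)
    if kept == 0:
        return 0.0  # no token is supported by the positive crop
    total = sum(
        g * (s - t) for g, s, t in zip(gate, student_logp, teacher_logp_pos)
    )
    return total / kept


if __name__ == "__main__":
    # Toy numbers, invented for illustration only.
    pos = [-0.1, -2.0, -0.5]
    neg = [-1.0, -0.5, -0.5]
    student = [-0.2, -1.0, -0.3]
    print(contrastive_gate(pos, neg))
    print(gated_distillation_loss(student, pos, neg))
```

In this toy run only the first token passes the gate, so the loss reduces to that token's student-minus-teacher log-probability gap. A soft gate (for example a sigmoid of the log-probability difference) would be an equally plausible reading of the prose; the hard threshold is used here only because it is the simplest to inspect.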