# VGR: Visual Grounded Reasoning

## Overview

This is the home page for VGR (Visual Grounded Reasoning), a novel multimodal large language model (MLLM) designed to enhance fine-grained visual perception and reasoning capabilities. Unlike traditional MLLMs, VGR enables selective attention to visual regions during inference, improving accuracy in complex visual reasoning tasks. It introduces a self-driven selective visual replay mechanism and is trained on a large-scale dataset (VGR-SFT) that integrates visual grounding and language deduction.

- [arXiv paper](https://arxiv.org/pdf/2506.11991)
- [Data repository](https://huggingface.co/datasets/BytedanceDouyinContent/VGR)