Update README.md
README.md
CHANGED
|
@@ -21,7 +21,7 @@ tags:
|
|
| 21 |
- thinking=0
|
| 22 |
---
|
| 23 |
|
| 24 |
-

|
| 112 |
-
|
| 113 |
* **Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution**
|
| 114 |
-
[https://arxiv.org/pdf/2409.12191](https://arxiv.org/pdf/2409.12191)
|
| 115 |
-
|
| 116 |
* **Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond**
|
| 117 |
-
|
| 118 |
-
|
| 119 |
-
* **A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy**
|
| 120 |
-
[https://arxiv.org/pdf/2412.02210](https://arxiv.org/pdf/2412.02210)
|
| 121 |
-
|
| 122 |
-
* **Ground-R1: Incentivizing Grounded Visual Reasoning via Reinforcement Learning**
|
| 123 |
-
[https://arxiv.org/pdf/2505.20272](https://arxiv.org/pdf/2505.20272)
|
|
|
|
| 21 |
- thinking=0
|
| 22 |
---
|
| 23 |
|
| 24 |
+

|
| 25 |
|
| 26 |
# **Enesidaon-VLR-7B-no-Thinking**
|
| 27 |
|
|
|
|
| 108 |
## References
|
| 109 |
|
| 110 |
* **YaRN: Efficient Context Window Extension of Large Language Models**
| 111 |
* **Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution**
| 112 |
* **Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond**
|
| 113 |
+
* **A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy**