Update README.md
README.md
CHANGED
@@ -35,7 +35,7 @@ It is a fine-tune of **Qwen 2.5-VL-7B** using ~10k synthetic Doc-to-Reasoning-to

 **NuMarkdown-reasoning** is significantly better than similar-size non-reasoning models trained for markdown generation on complex documents, and achieves competitive results against top closed-source alternatives.

-### Arena ranking (using trueskill-2 ranking system, with around 500 votes):
+### Arena ranking against popular alternatives (using trueskill-2 ranking system, with around 500 votes):
 <p align="center">

 | Rank | Model | μ | σ | μ − 3σ |
@@ -52,7 +52,7 @@ It is a fine-tune of **Qwen 2.5-VL-7B** using ~10k synthetic Doc-to-Reasoning-to

 *We plan to release a markdown arena, similar to llmArena, for complex document-to-markdown tasks to provide a tool to evaluate different solutions.*

-### Win-rate against other models (image-only):
+### Win/Draw/Loss-rate against other models (image-only):
 <p align="center">
 <img src="bar plot.png" width="700"/>
 </p>