Evaluation results

| # | Model | Precision | Recall | F1 (micro) | Time (s) |
|---|-------|-----------|--------|------------|----------|
| 1 | [gemini] gemini-2.5-flash | 0.932 | 0.919 | 0.925 | 5.1 |
| 2 | [gemini] gemini-2.5-flash-lite | 0.915 | 0.878 | 0.897 | 3.1 |
| 3 | [kisski] devstral-2-123b-instruct-2512 | 0.868 | 0.892 | 0.880 | 32.2 |
| 4 | [kisski] qwen3-coder-30b-a3b-instruct | 0.816 | 0.838 | 0.827 | 12.6 |
| 5 | [kisski] internvl3.5-30b-a3b | 0.877 | 0.770 | 0.820 | 8.3 |
| 6 | [kisski] qwen3-vl-30b-a3b-instruct | 0.806 | 0.784 | 0.795 | 11 |
| 7 | [kisski] qwen3-omni-30b-a3b-instruct | 0.800 | 0.649 | 0.716 | 10.3 |
| 8 | [kisski] apertus-70b-instruct-2509 | 0.763 | 0.392 | 0.518 | 13.6 |
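
As a quick sanity check on the table above, the F1 column can be recomputed from the precision and recall columns. The sketch below assumes that F1 is the usual harmonic mean of precision and recall; the helper name `f1_score` is hypothetical and not taken from the project's evaluation code. Because the inputs are already rounded to three decimals, the recomputed value can differ from the reported one in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (model, precision, recall, reported F1), copied from the table above
rows = [
    ("gemini-2.5-flash", 0.932, 0.919, 0.925),
    ("gemini-2.5-flash-lite", 0.915, 0.878, 0.897),
    ("devstral-2-123b-instruct-2512", 0.868, 0.892, 0.880),
    ("qwen3-coder-30b-a3b-instruct", 0.816, 0.838, 0.827),
    ("internvl3.5-30b-a3b", 0.877, 0.770, 0.820),
    ("qwen3-vl-30b-a3b-instruct", 0.806, 0.784, 0.795),
    ("qwen3-omni-30b-a3b-instruct", 0.800, 0.649, 0.716),
    ("apertus-70b-instruct-2509", 0.763, 0.392, 0.518),
]

for model, p, r, reported in rows:
    computed = f1_score(p, r)
    # Inputs are rounded, so compare with a small tolerance
    status = "ok" if abs(computed - reported) < 0.002 else "mismatch"
    print(f"{model}: computed={computed:.3f} reported={reported:.3f} {status}")
```

The last row shows why F1 is more informative than precision alone: apertus-70b-instruct-2509 has a precision of 0.763, but its recall of 0.392 pulls the harmonic mean down to 0.518.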