evaluate it on the OLMOCR benchmark
Are there plans to evaluate it on the olmOCR benchmark and compare it with LightOnOCR-2?
We found quite a few issues with olmOCR-bench during our evaluation, as mentioned in my response to this issue.
@yeekal
Currently, the evaluation metrics used by olmOCR-bench have some limitations and cannot effectively and fairly assess a model’s true capability in document parsing. For example, as shown in the figure, olmOCR-bench splits multi-line formulas into individual single-line formulas for evaluation, which does not comply with common standards for formula recognition. This results in abnormally high accuracy for models whose outputs match the training data distribution.
Additionally, using pass rate as the sole metric is too strict. For a complex formula, if the model misrecognizes just one character, the score is 0; if it misrecognizes 100 characters, the score is still 0. This fails to reflect the actual performance differences between models.
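The pass-rate criticism above can be made concrete with a small sketch (illustrative only, not olmOCR-bench's actual scoring code; the formula strings and function names are hypothetical): a binary exact-match score treats a one-character slip and a fully garbled output identically, while a graded character-level similarity separates them.

```python
# Illustrative comparison of a binary pass/fail metric vs. a graded
# character-level similarity for formula-recognition outputs.
from difflib import SequenceMatcher

def pass_rate(pred: str, target: str) -> float:
    """Binary exact-match scoring: any single wrong character yields 0."""
    return 1.0 if pred == target else 0.0

def char_similarity(pred: str, target: str) -> float:
    """Graded scoring in [0, 1] based on matching character runs."""
    return SequenceMatcher(None, pred, target).ratio()

target       = r"\frac{a+b}{c} = \sum_{i=1}^{n} x_i"
one_char_off = r"\frac{a+b}{c} = \sum_{i=1}^{n} x_j"  # one wrong character
garbled      = r"frac a b c sum x"                    # heavily damaged

# Pass rate scores both errors identically as total failures...
assert pass_rate(one_char_off, target) == 0.0
assert pass_rate(garbled, target) == 0.0
# ...while a character-level metric reflects the difference in quality.
assert char_similarity(one_char_off, target) > 0.9
assert char_similarity(garbled, target) < 0.7
```

`difflib.SequenceMatcher` from the standard library is used here only for convenience; a normalized Levenshtein distance or a tree-edit distance over parsed LaTeX would serve the same argument.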
Their capabilities should be compared on more difficult documents; documents that are too simple cannot distinguish between the models. Also, can the model output bounding boxes (bbox) for images?



