Gemma results weirdness
#18, opened by alternis
Hi,
There are inconsistent results between the Gemma 3 paper and this eval toolkit. The paper (https://arxiv.org/pdf/2503.19786) claims a score of 68.8 on ChartQA, versus 33.7 on the leaderboard. To be honest, I was not able to reproduce either number, since inference simply does not work for Gemma on any dataset with the code from the VLMEval git repository.