low performance on this checkpoint

by pxyu - opened Sep 19, 2024


Hi,

I am running some experiments with the BGE-M3 family of models to test the impact of unsupervised pre-training. Here are some results (R@100 on MIRACL):

MODEL	DE	EN	ES
XLMR + 60M CC News data	722	721	763
BGE RETRO + 60M CC News data	772	774	789
BGE Unsupervised (this repo)	727	758	668
BGE M3	908	907	902

The third row, BGE Unsupervised, is clearly an outlier: the unsupervised pre-training done on your side seems to perform worse than training on 60M data points on my side. It is below BGE RETRO + 60M CC News data in all three languages and below XLMR + 60M CC News data on ES. I wonder whether the wrong checkpoint was uploaded, or whether I am not using or evaluating this checkpoint correctly.
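For reference, here is a minimal sketch of how recall@k (R@100 when k=100) is commonly computed, in case the metric definition is part of the mismatch. It assumes a ranked list of document IDs per query and a set of judged-relevant IDs per query. The function name `recall_at_k` and the toy IDs are illustrative only, and this is not necessarily the exact script or definition behind the numbers above, since evaluation toolkits can differ in details.

```python
from typing import Dict, List, Set


def recall_at_k(
    ranked: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int = 100,
) -> float:
    """Mean over queries of the fraction of relevant docs found in the top k."""
    scores = []
    for qid, relevant in qrels.items():
        if not relevant:
            continue  # skip queries with no judged-relevant documents
        top_k = set(ranked.get(qid, [])[:k])
        scores.append(len(top_k & relevant) / len(relevant))
    return sum(scores) / len(scores) if scores else 0.0


# Toy data with hypothetical IDs (not MIRACL data or results).
ranked = {"q1": ["d3", "d1", "d7"], "q2": ["d9", "d2", "d4"]}
qrels = {"q1": {"d1", "d5"}, "q2": {"d2"}}

score = recall_at_k(ranked, qrels, k=2)
print(score)
```

With k=2, query q1 retrieves one of its two relevant documents and q2 retrieves its only relevant document, so the mean is 0.75.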

Thanks.
