What languages does snowflake-arctic-embed-m-v2.0 support?

by Ivan22wyh - opened Feb 17, 2025

Feb 17, 2025

Thank you for your excellent work, which has greatly helped our project. We would now like to train a multilingual quality classifier based on snowflake-arctic-embed-m-v2.0, but we are unsure which languages the model specifically supports. Could you let us know?

lukemerrick

Feb 17, 2025

The best way to find out if this model is a good choice for your fine-tuning task is to try it out. You can look at the training details in our technical report (linked in the news section of our model card) for information about which languages we focused on with our contrastive training, but this may or may not tell you much about how performance will turn out on a quality classification task.

Other potentially helpful details: The tokenizer is from XLMR and the MLM pretraining details are here: https://arxiv.org/abs/2407.19669
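One way to "try it out" per language, as suggested above, is to fit a very simple probe on precomputed embeddings and break accuracy down by language. The following is only a minimal sketch under assumptions: the embeddings are taken to be plain lists of floats that you have already produced with the model (that step is omitted here), the nearest-centroid probe is a stand-in for whatever classifier head you actually fine-tune, and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence

Vector = Sequence[float]


def centroid(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_centroids(
    embeddings: Sequence[Vector], labels: Sequence[Hashable]
) -> Dict[Hashable, List[float]]:
    """Compute one centroid per quality label from training embeddings."""
    by_label = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        by_label[label].append(vec)
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(centroids: Dict[Hashable, List[float]], vec: Vector) -> Hashable:
    """Return the label whose centroid is closest to the given embedding."""
    return min(centroids, key=lambda label: squared_distance(centroids[label], vec))


def per_language_accuracy(
    centroids: Dict[Hashable, List[float]],
    embeddings: Sequence[Vector],
    labels: Sequence[Hashable],
    languages: Sequence[str],
) -> Dict[str, float]:
    """Accuracy of the probe on held-out data, grouped by language tag."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for vec, label, lang in zip(embeddings, labels, languages):
        total[lang] += 1
        correct[lang] += int(predict(centroids, vec) == label)
    return {lang: correct[lang] / total[lang] for lang in total}
```

Languages where the held-out accuracy of such a probe drops sharply relative to the others are candidates for a closer look before committing to a full fine-tuning run. A weak probe result does not prove the model lacks support for a language, only that the frozen embeddings separate your quality labels less cleanly there.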

pxyu changed discussion status to closed Jun 4, 2025
