Difference between this model and togethercomputer's

by tsteffek - opened Nov 29, 2024

Hi, I'm wondering which version to use for document classification on medical texts, and I came across both this version and https://huggingface.co/togethercomputer/m2-bert-80M-32k. In a GitHub issue, Daniel Fu mentions that V1 has seen legal texts, so there seems to be some difference between them, but I couldn't find a comprehensive list of the differences. Is there one somewhere?

When trying the two models, I also noticed that this version does emit the FlashFFT warnings mentioned on GitHub, while the other one doesn't. So is only one of them using FlashFFT? (And can these warnings be safely ignored when fine-tuning further?)
