Most unslopped base model: https://huggingface.co/allenai/OLMo-2-0325-32B/tree/stage1-step721901-tokens6056B
Hi! Quants for OLMo-2-0325-32B already exist and are good, but this particular checkpoint, taken right at the end of stage 1 pretraining, is even better for creative writing than the final base model, because the mid-training stage included synthetic data.
Even with a context length of only 4,096, this is the most unslopped modern base model I could find, probably because it was trained primarily on DCLM-Baseline, which does not include documents from 2023 or later.
I'd like to request quants for the checkpoint at the specific branch linked above, because I keep referencing it in conversations and others are asking to use it. Thank you so much! Suggested model name: OLMo-2-0325-32B-stage1-6T
Hi, it would be great if you could instead clone it into a repo with the desired name and I'll queue it from there, because we don't have a way to target a specific name and branch =)
As soon as you do that I can easily queue it for you
It's queued!
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#OLMo-2-0325-32B-stage1-6T-GGUF for quants to appear.
Thank you very much!