Continued pretraining of Llama 3.1 8B on RefinedWeb for ~80M tokens, aiming to undo the annealing step and make the model behave more like an actual base model.
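For a sense of scale, the ~80M-token budget maps to a fairly small number of optimizer steps. A minimal sketch of the arithmetic; the sequence length and batch size below are illustrative assumptions, not the values actually used for this run:

```python
# Rough step-count arithmetic for an ~80M-token continued-pretraining run.
# seq_len and per_step_sequences are assumed values for illustration only.
token_budget = 80_000_000
seq_len = 8192           # assumed packed sequence length per sample
per_step_sequences = 64  # assumed global batch size across all devices

tokens_per_step = seq_len * per_step_sequences  # tokens consumed per optimizer step
steps = token_budget // tokens_per_step

print(tokens_per_step)  # 524288 tokens per step
print(steps)            # 152 optimizer steps under these assumptions
```

At these assumed settings the whole run is only ~150 steps, which is why short continued-pretraining runs like this typically keep a low, constant learning rate rather than a full warmup/decay schedule.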