Baseline: FineWeb-Edu Validation Run
Results
- Model: baseline (30.1M params)
- Dataset: FineWeb-Edu (10BT sample)
- Tokens trained: 600M / 600M target
- Best eval loss: 4.3645 (perplexity 78.6)
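The perplexity figure above can be reproduced from the eval loss. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which the card does not state explicitly; the function name `perplexity` is illustrative only.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss_nats)


# Best eval loss reported in the Results list.
best_eval_loss = 4.3645
print(f"perplexity = {perplexity(best_eval_loss):.1f}")  # perplexity = 78.6
```

If the loss were logged in bits rather than nats, the base would be 2 instead of e, and the result would not match the reported 78.6.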
Training curve
| Step | Train loss | Eval loss | Tokens seen |
|---|---|---|---|
| 500 | 7.7696 | 6.6325 | 8M |
| 1,000 | 6.3483 | 6.2159 | 16M |
| 1,500 | 6.0044 | 5.9292 | 25M |
| 2,000 | 5.7224 | 5.7294 | 33M |
| 2,500 | 5.5025 | 5.5750 | 41M |
| 3,000 | 5.3592 | 5.4477 | 49M |
| 3,500 | 5.2691 | 5.3148 | 57M |
| 4,000 | 5.2516 | 5.2179 | 66M |
| 4,500 | 5.1663 | 5.1339 | 74M |
| 5,000 | 5.0529 | 5.0734 | 82M |
| 5,500 | 4.9488 | 5.0208 | 90M |
| 6,000 | 4.8459 | 4.9849 | 98M |
| 6,500 | 4.8417 | 4.8878 | 106M |
| 7,000 | 4.8620 | 4.8170 | 115M |
| 7,500 | 4.8097 | 4.7894 | 123M |
| 8,000 | 4.7614 | 4.7524 | 131M |
| 8,500 | 4.6882 | 4.7186 | 139M |
| 9,000 | 4.6049 | 4.7072 | 147M |
| 9,500 | 4.6094 | 4.6697 | 156M |
| 10,000 | 4.6656 | 4.6475 | 164M |
| 10,500 | 4.6489 | 4.6073 | 172M |
| 11,000 | 4.6001 | 4.6066 | 180M |
| 11,500 | 4.5211 | 4.5850 | 188M |
| 12,000 | 4.4750 | 4.5846 | 197M |
| 12,500 | 4.5026 | 4.5556 | 205M |
| 13,000 | 4.5818 | 4.5301 | 213M |
| 13,500 | 4.5706 | 4.5150 | 221M |
| 14,000 | 4.5056 | 4.4996 | 229M |
| 14,500 | 4.4607 | 4.5010 | 238M |
| 15,000 | 4.4036 | 4.5035 | 246M |
| 15,500 | 4.4296 | 4.4854 | 254M |
| 16,000 | 4.5026 | 4.4645 | 262M |
| 16,500 | 4.4995 | 4.4581 | 270M |
| 17,000 | 4.4517 | 4.4465 | 279M |
| 17,500 | 4.4045 | 4.4473 | 287M |
| 18,000 | 4.3482 | 4.4493 | 295M |
| 18,500 | 4.3925 | 4.4374 | 303M |
| 19,000 | 4.4746 | 4.4286 | 311M |
| 19,500 | 4.4622 | 4.4225 | 319M |
| 20,000 | 4.4266 | 4.4193 | 328M |
| 20,500 | 4.3785 | 4.4204 | 336M |
| 21,000 | 4.3252 | 4.4205 | 344M |
| 21,500 | 4.3798 | 4.4139 | 352M |
| 22,000 | 4.4606 | 4.4079 | 360M |
| 22,500 | 4.4519 | 4.4042 | 369M |
| 23,000 | 4.4135 | 4.4074 | 377M |
| 23,500 | 4.3586 | 4.4096 | 385M |
| 24,000 | 4.3054 | 4.4122 | 393M |
| 24,500 | 4.3714 | 4.4034 | 401M |
| 25,000 | 4.4514 | 4.3967 | 410M |
| 25,500 | 4.4385 | 4.3967 | 418M |
| 26,000 | 4.3937 | 4.3953 | 426M |
| 26,500 | 4.3422 | 4.4000 | 434M |
| 27,000 | 4.2983 | 4.4036 | 442M |
| 27,500 | 4.3838 | 4.3924 | 451M |
| 28,000 | 4.4430 | 4.3854 | 459M |
| 28,500 | 4.4284 | 4.3836 | 467M |
| 29,000 | 4.3919 | 4.3861 | 475M |
| 29,500 | 4.3394 | 4.3890 | 483M |
| 30,000 | 4.3080 | 4.3907 | 492M |
| 30,500 | 4.3546 | 4.3829 | 500M |
| 31,000 | 4.4281 | 4.3777 | 508M |
| 31,500 | 4.4159 | 4.3781 | 516M |
| 32,000 | 4.3639 | 4.3766 | 524M |
| 32,500 | 4.3124 | 4.3788 | 532M |
| 33,000 | 4.3013 | 4.3770 | 541M |
| 33,500 | 4.3538 | 4.3733 | 549M |
| 34,000 | 4.4177 | 4.3687 | 557M |
| 34,500 | 4.3977 | 4.3659 | 565M |
| 35,000 | 4.3555 | 4.3674 | 573M |
| 35,500 | 4.2929 | 4.3711 | 582M |
| 36,000 | 4.2843 | 4.3698 | 590M |
| 36,500 | 4.3488 | 4.3645 | 598M |
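The "Tokens seen" column grows linearly with the step count. As a rough check, the sketch below assumes 16,384 tokens per optimizer step; this value is inferred from the table (598M tokens at step 36,500) and is not stated in the card, so treat it as an assumption rather than the actual training configuration.

```python
# Assumed tokens per step, inferred from the table and not confirmed by the card.
TOKENS_PER_STEP = 16_384


def tokens_seen(step: int, tokens_per_step: int = TOKENS_PER_STEP) -> int:
    """Total tokens consumed after `step` optimizer steps."""
    return step * tokens_per_step


# Compare against a few rows of the table (values rounded to millions).
for step in (500, 10_000, 36_500):
    millions = round(tokens_seen(step) / 1e6)
    print(f"step {step:>6,}: {millions}M tokens")
```

Under this assumption the rounded values line up with the table rows for steps 500, 10,000, and 36,500 (8M, 164M, and 598M).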