Baseline - FineWeb-Edu Validation Run

Results

  • Model: baseline (30.1M params)
  • Dataset: FineWeb-Edu (10BT sample)
  • Tokens trained: 600M / 600M target
  • Best eval loss: 4.3645 (perplexity 78.6)
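The perplexity figure above can be reproduced from the eval loss, assuming the loss is the mean per-token cross-entropy in nats (the card does not say so explicitly, but the reported pair is consistent with it). A minimal sketch, where `perplexity` is a hypothetical helper name:

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    # Assumption: the reported loss uses the natural logarithm.
    return math.exp(mean_cross_entropy_nats)


best_eval_loss = 4.3645  # best eval loss reported in the results list
print(f"perplexity = {perplexity(best_eval_loss):.1f}")
```

If the loss were measured in bits instead, the base would be 2 rather than e, and the result would not match the reported value.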

Training curve

Step    Train loss  Eval loss  Tokens seen
500     7.7696      6.6325     8M
1,000   6.3483      6.2159     16M
1,500   6.0044      5.9292     25M
2,000   5.7224      5.7294     33M
2,500   5.5025      5.5750     41M
3,000   5.3592      5.4477     49M
3,500   5.2691      5.3148     57M
4,000   5.2516      5.2179     66M
4,500   5.1663      5.1339     74M
5,000   5.0529      5.0734     82M
5,500   4.9488      5.0208     90M
6,000   4.8459      4.9849     98M
6,500   4.8417      4.8878     106M
7,000   4.8620      4.8170     115M
7,500   4.8097      4.7894     123M
8,000   4.7614      4.7524     131M
8,500   4.6882      4.7186     139M
9,000   4.6049      4.7072     147M
9,500   4.6094      4.6697     156M
10,000  4.6656      4.6475     164M
10,500  4.6489      4.6073     172M
11,000  4.6001      4.6066     180M
11,500  4.5211      4.5850     188M
12,000  4.4750      4.5846     197M
12,500  4.5026      4.5556     205M
13,000  4.5818      4.5301     213M
13,500  4.5706      4.5150     221M
14,000  4.5056      4.4996     229M
14,500  4.4607      4.5010     238M
15,000  4.4036      4.5035     246M
15,500  4.4296      4.4854     254M
16,000  4.5026      4.4645     262M
16,500  4.4995      4.4581     270M
17,000  4.4517      4.4465     279M
17,500  4.4045      4.4473     287M
18,000  4.3482      4.4493     295M
18,500  4.3925      4.4374     303M
19,000  4.4746      4.4286     311M
19,500  4.4622      4.4225     319M
20,000  4.4266      4.4193     328M
20,500  4.3785      4.4204     336M
21,000  4.3252      4.4205     344M
21,500  4.3798      4.4139     352M
22,000  4.4606      4.4079     360M
22,500  4.4519      4.4042     369M
23,000  4.4135      4.4074     377M
23,500  4.3586      4.4096     385M
24,000  4.3054      4.4122     393M
24,500  4.3714      4.4034     401M
25,000  4.4514      4.3967     410M
25,500  4.4385      4.3967     418M
26,000  4.3937      4.3953     426M
26,500  4.3422      4.4000     434M
27,000  4.2983      4.4036     442M
27,500  4.3838      4.3924     451M
28,000  4.4430      4.3854     459M
28,500  4.4284      4.3836     467M
29,000  4.3919      4.3861     475M
29,500  4.3394      4.3890     483M
30,000  4.3080      4.3907     492M
30,500  4.3546      4.3829     500M
31,000  4.4281      4.3777     508M
31,500  4.4159      4.3781     516M
32,000  4.3639      4.3766     524M
32,500  4.3124      4.3788     532M
33,000  4.3013      4.3770     541M
33,500  4.3538      4.3733     549M
34,000  4.4177      4.3687     557M
34,500  4.3977      4.3659     565M
35,000  4.3555      4.3674     573M
35,500  4.2929      4.3711     582M
36,000  4.2843      4.3698     590M
36,500  4.3488      4.3645     598M
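
For readers who want to work with the training curve programmatically, the following is a minimal sketch that parses whitespace-separated rows in the layout shown above and picks out the row with the lowest eval loss. Only the last four rows are embedded here for brevity; `parse_row` and the dictionary keys are hypothetical names, not part of any released tooling. The tokens-per-step estimate is inferred from the rounded "Tokens seen" column and is not stated by the card.

```python
def parse_row(line: str) -> dict:
    """Parse one curve row: step, train loss, eval loss, tokens seen (in millions)."""
    step, train_loss, eval_loss, tokens = line.split()
    return {
        "step": int(step.replace(",", "")),   # "36,500" -> 36500
        "train_loss": float(train_loss),
        "eval_loss": float(eval_loss),
        "tokens_m": int(tokens.rstrip("M")),  # "598M" -> 598
    }


# Last four rows of the training curve, copied from the table.
ROWS = """\
35,000 4.3555 4.3674 573M
35,500 4.2929 4.3711 582M
36,000 4.2843 4.3698 590M
36,500 4.3488 4.3645 598M
"""

rows = [parse_row(line) for line in ROWS.splitlines()]

# Row with the lowest eval loss among the parsed rows.
best = min(rows, key=lambda r: r["eval_loss"])
print(f"best eval loss {best['eval_loss']} at step {best['step']}")

# Rough tokens-per-step estimate from the final row (inferred, approximate,
# because the token counts in the table are rounded to the nearest million).
last = rows[-1]
tokens_per_step = last["tokens_m"] * 1_000_000 / last["step"]
print(f"about {tokens_per_step:.0f} tokens per step")
```

Feeding in the full table instead of the four-row excerpt only requires pasting every row into `ROWS`; the parsing logic stays the same.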