konwoo's Collections
Pre-training under infinite compute
Updated Oct 25, 2025
Pre-trained models for https://arxiv.org/abs/2509.14786 (Figure 10)
konwoo/300m4k-209Mx16-dclm-ens8x0730-0.9-cos-lr0.0030-wd0.10-bs64 • 0.3B • Updated Oct 24, 2025
konwoo/300m4k-209Mx16-dclm-sd0805-0.75-cos-lr0.0030-wd0.10-bs64 • 0.3B • Updated Oct 24, 2025
konwoo/1_4b4k-209Mx4-dclm-cos-lr0.0003-wd0.10-bs64 • 1B • Updated Oct 24, 2025
konwoo/600m4k-209Mx4-dclm-cos-lr0.0010-wd0.10-bs64 • 0.6B • Updated Oct 24, 2025
konwoo/300m4k-209Mx8-dclm-cos-lr0.0010-wd0.10-bs64 • 0.3B • Updated Oct 24, 2025
konwoo/150m4k-209Mx8-dclm-cos-lr0.0030-wd0.10-bs64 • 0.2B • Updated Oct 24, 2025 • 1.3k
konwoo/1_4b4k-209Mx8-dclm-cos-lr0.0010-wd3.20-seed0 • 1B • Updated Oct 24, 2025
konwoo/600m4k-209Mx8-dclm-cos-lr0.0010-wd3.20-seed0 • 0.6B • Updated Oct 24, 2025
konwoo/300m4k-209Mx16-dclm-cos-lr0.0030-wd1.60-seed0 • 0.3B • Updated Oct 24, 2025
konwoo/150m4k-209Mx16-dclm-cos-lr0.0030-wd0.80-seed0 • 0.2B • Updated Oct 24, 2025
konwoo/1_4b4k-209Mx16-dclm-cos-lr0.0010-wd1.60-seed4 • 1B • Updated Oct 24, 2025
konwoo/1_4b4k-209Mx16-dclm-cos-lr0.0010-wd1.60-seed3 • 1B • Updated Oct 24, 2025
konwoo/1_4b4k-209Mx16-dclm-cos-lr0.0010-wd1.60-seed2 • 1B • Updated Oct 24, 2025
konwoo/1_4b4k-209Mx16-dclm-cos-lr0.0010-wd1.60-seed1 • 1B • Updated Oct 24, 2025
konwoo/1_4b4k-209Mx16-dclm-cos-lr0.0010-wd1.60-seed0 • 1B • Updated Oct 24, 2025 • 1
konwoo/600m4k-209Mx16-dclm-cos-lr0.0010-wd1.60-seed4 • 0.6B • Updated Oct 24, 2025
konwoo/600m4k-209Mx16-dclm-cos-lr0.0010-wd1.60-seed3 • 0.6B • Updated Oct 24, 2025
konwoo/600m4k-209Mx16-dclm-cos-lr0.0010-wd1.60-seed2 • 0.6B • Updated Oct 24, 2025 • 1
konwoo/600m4k-209Mx16-dclm-cos-lr0.0010-wd1.60-seed1 • 0.6B • Updated Oct 24, 2025
konwoo/600m4k-209Mx16-dclm-cos-lr0.0010-wd1.60-seed0 • 0.6B • Updated Oct 24, 2025
konwoo/300m4k-209Mx32-dclm-cos-lr0.0030-wd0.80-seed4 • 0.3B • Updated Oct 24, 2025
konwoo/300m4k-209Mx32-dclm-cos-lr0.0030-wd0.80-seed3 • 0.3B • Updated Oct 24, 2025
konwoo/300m4k-209Mx32-dclm-cos-lr0.0030-wd0.80-seed2 • 0.3B • Updated Oct 24, 2025
konwoo/300m4k-209Mx32-dclm-cos-lr0.0030-wd0.80-seed1 • 0.3B • Updated Oct 24, 2025
konwoo/300m4k-209Mx32-dclm-cos-lr0.0030-wd0.80-seed0 • 0.3B • Updated Oct 24, 2025
konwoo/150m4k-209Mx32-dclm-cos-lr0.0030-wd0.40-seed4 • 0.2B • Updated Oct 24, 2025
konwoo/150m4k-209Mx32-dclm-cos-lr0.0030-wd0.40-seed3 • 0.2B • Updated Oct 24, 2025
konwoo/150m4k-209Mx32-dclm-cos-lr0.0030-wd0.40-seed2 • 0.2B • Updated Oct 24, 2025
konwoo/150m4k-209Mx32-dclm-cos-lr0.0030-wd0.40-seed1 • 0.2B • Updated Oct 24, 2025
konwoo/150m4k-209Mx32-dclm-cos-lr0.0030-wd0.40-seed0 • 0.2B • Updated Oct 24, 2025
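The repository names in this collection appear to follow a regular pattern, so they can be split into fields for filtering or grouping. The sketch below is inferred only from the names listed here; the collection does not document the scheme. The function `parse_repo_name` and its field names are hypothetical, and the meanings attached to them (for example, `lr` as learning rate, `wd` as weight decay, `bs` as batch size, and the `209Mx16` part as a base count times a multiplier) are assumptions.

```python
import re


def parse_repo_name(repo_id: str) -> dict:
    """Split a repo id from this collection into fields.

    Field meanings are guesses based on common abbreviations,
    not on any documentation from the collection itself.
    """
    name = repo_id.split("/")[-1]  # drop the "konwoo/" owner prefix if present
    fields = {"name": name}

    # Leading part, e.g. "1_4b4k-209Mx16-": model tag, context tag, base x multiplier.
    head = re.match(r"(?P<model>[0-9_]+[mb])(?P<context>\d+k)-(?P<base>\d+M)x(?P<mult>\d+)-", name)
    if head:
        fields["model"] = head["model"]
        fields["context"] = head["context"]
        fields["base"] = head["base"]
        fields["multiplier"] = int(head["mult"])

    # Optional numeric suffixes; a key is present only if the name contains it.
    suffixes = (
        ("lr", r"-lr([\d.]+)", float),
        ("wd", r"-wd([\d.]+)", float),
        ("bs", r"-bs(\d+)", int),
        ("seed", r"-seed(\d+)", int),
    )
    for key, pattern, cast in suffixes:
        hit = re.search(pattern, name)
        if hit:
            fields[key] = cast(hit.group(1))
    return fields


if __name__ == "__main__":
    print(parse_repo_name("konwoo/1_4b4k-209Mx8-dclm-cos-lr0.0010-wd3.20-seed0"))
```

Parts of a name that the sketch does not recognise, such as `ens8x0730-0.9` or `sd0805-0.75`, are left in the `name` field unparsed rather than guessed at.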
Paper: Pre-training under infinite compute • 2509.14786 • Published Sep 18, 2025 • 2