Dataset Mix for Pre-Training SLMs
open-thoughts/OpenThoughts-114k • 228k • 143k • 833
open-r1/OpenThoughts-114k-math • 89.1k • 811 • 92
• 52.5B • 650k • 2.77k
FreedomIntelligence/medical-o1-reasoning-SFT • 90.1k • 7.44k • 1.09k
• 860k • 36.7k • 568
• 188k • 20
• 30.4k • 5
ChayanM/MIMIC-Impression-Dataset • 292k • 32 • 2
• 377 • 8
• 2.21M • 10.5k • 99
EleutherAI/SmolLM2-135M-10B • 10.1M • 2.47k • 1