SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs
Paper • arXiv:2412.08347
A collection of models that use SmolLM2 as the pretrained base, post-trained with AllenAI's Tulu 3 pipeline.