🤏 Smol-Data Collection Tried and tested mixes for strong pretraining. Inspired by https://huggingface.co/blog/codelion/optimal-dataset-mixing • 14 items • Updated 11 days ago • 12
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling Paper • 2312.15166 • Published Dec 23, 2023 • 61