
· GPT2-small architecture

· Randomly initialized

· Distilled on BabyLM dataset (10M) using teacher model GPT2-large-BabyLM
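
The card does not state which distillation objective was used. As a rough illustration only, the sketch below assumes the common soft-target formulation: a temperature-scaled KL divergence between teacher and student next-token distributions, mixed with the ordinary cross-entropy on the true token. The function names, `temperature`, and `alpha` are hypothetical choices for this example, not values taken from this model.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, target_index,
                      temperature=2.0, alpha=0.5):
    """Assumed objective: alpha * soft-target KL + (1 - alpha) * hard cross-entropy.

    temperature and alpha are illustrative defaults, not the settings
    used to train this model.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # The T^2 factor keeps the soft-term gradient scale comparable across temperatures.
    soft = kl_divergence(p_teacher, p_student) * temperature ** 2
    # Standard next-token cross-entropy at temperature 1.
    hard = -math.log(softmax(student_logits)[target_index])
    return alpha * soft + (1 - alpha) * hard


# Toy vocabulary of 4 tokens: teacher (e.g. GPT2-large-BabyLM) vs. student logits.
teacher = [2.0, 0.5, -1.0, 0.1]
student = [1.0, 0.8, -0.5, 0.0]
loss = distillation_loss(student, teacher, target_index=0)
```

In an actual training loop this quantity would be averaged over every token position in a batch, with the teacher's weights frozen and only the randomly initialized student updated.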