Pretraining data-constrained and cognitively relevant baby LLMs
BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data