Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 15 items • Updated 14 days ago • 170
Luciole LLM Collection Open Source LLM in French, English, German, Spanish, Italian, Portuguese, Dutch and Arabic • 8 items • Updated 23 days ago • 9
Teacher Demonstrations in a BabyLM's Zone of Proximal Development for Contingent Multi-Turn Interaction Paper • 2510.20411 • Published Oct 23, 2025 • 2
Article Reinforcement Learning for Large Language Models: Beyond the Agent Paradigm royswastik • Mar 19, 2025 • 9
Less is More: Pre-Training Cross-Lingual Small-Scale Language Models with Cognitively-Plausible Curriculum Learning Strategies Paper • 2410.22886 • Published Oct 30, 2024 • 1