Tucano2 Collection — An open suite of large language models (LLMs) with 0.5–3.7 billion parameters, designed to address the gap in open-source development for Portuguese. • 33 items • Updated 1 day ago
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 — Explore synthetic data experiments in a bookshelf view
Raising Bars, Not Parameters: LilMoo Compact Language Model for Hindi — Paper • 2603.03508 • Published 9 days ago
LilTii Collection — A 0.6B Bengali language model that outperforms Qwen. • 8 items • Updated 8 days ago
LilMoo Collection — A 0.6-billion-parameter Hindi language model trained entirely from scratch. • 9 items • Updated 8 days ago