The Smol Training Playbook 📚: The secrets to building world-class LLMs • Featured • 2.76k
Vietnamese speech dataset: Collection for any speech-related tasks, including but not limited to speech-to-text, text-to-speech, speech classification, and speaker verification • 34 items • Updated Jul 8, 2025 • 37
erax-ai/EraX-WoW-Turbo-V1.1 Automatic Speech Recognition • 0.8B • Updated Mar 31, 2025 • 340 • 14
suzii/vi-whisper-large-v3-turbo-v1 Automatic Speech Recognition • 0.8B • Updated May 6, 2025 • 251 • 13
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 85