Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia • Paper • 2503.07920 • Published Mar 10, 2025
Speech-to-Text Datasets • Collection • A collection of STT (Speech-to-Text) datasets compatible with OpenBench. • 5 items • Updated Jul 21, 2025
Khmer Automation Speech Recognition • Collection • This project contributes to ASR modeling through a collection of models. • 6 items • Updated Mar 2