Masked Audio Generation using a Single Non-Autoregressive Transformer Paper • 2401.04577 • Published Jan 9, 2024 • 44
Article Mixture of Experts Explained +4 osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.12k
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing Paper • 2110.13900 • Published Oct 26, 2021 • 3
Moshi: a speech-text foundation model for real-time dialogue Paper • 2410.00037 • Published Sep 17, 2024 • 16
Article How to train a new language model from scratch using Transformers and Tokenizers julien-c • Feb 14, 2020 • 61