MARS: Enabling Autoregressive Models Multi-Token Generation • Paper • 2604.07023 • Published about 1 month ago • 38
FastVLM • Collection • Efficient Vision Encoding for Vision Language Models • 8 items • Updated Mar 2 • 112
OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models • Paper • 2604.00688 • Published Apr 1 • 13