Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music Paper • 2604.10905 • Published Apr 13 • 29
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published Oct 17, 2025 • 93
Cosmos-Predict1 Collection ⚠️ This collection is archived. 👉 https://huggingface.co/collections/nvidia/cosmos3 • 14 items • Updated 19 days ago • 304
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data Paper • 2410.02056 • Published Oct 2, 2024 • 6
GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities Paper • 2406.11768 • Published Jun 17, 2024 • 24