ChildVox: A Speech, Audio, and Large Audio-Language Model Benchmark in Understanding and Characterizing Sound across Childhood Paper • 2605.29257 • Published 1 day ago • 3
OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources Paper • 2605.29250 • Published 1 day ago • 50
Accent Vector: Controllable Accent Manipulation for Multilingual TTS Without Accented Data Paper • 2603.07534 • Published Mar 8 • 5
DreamVideo-Omni: Omni-Motion Controlled Multi-Subject Video Customization with Latent Identity Reinforcement Learning Paper • 2603.12257 • Published Mar 12 • 31
ShotVerse: Advancing Cinematic Camera Control for Text-Driven Multi-Shot Video Creation Paper • 2603.11421 • Published Mar 12 • 34
End-to-End Joint ASR and Speaker Role Diarization with Child-Adult Interactions Paper • 2601.17640 • Published Jan 25 • 6
daVinci-Dev: Agent-native Mid-training for Software Engineering Paper • 2601.18418 • Published Jan 26 • 126
Quantifying Speaker Embedding Phonological Rule Interactions in Accented Speech Synthesis Paper • 2601.14417 • Published Jan 20 • 5
tiantiaf/voxlect-spanish-dialect-whisper-large-v3 Audio Classification • 2B • Updated Aug 10, 2025 • 76 • 5
tiantiaf/voxlect-english-dialect-whisper-small Audio Classification • 90.4M • Updated Aug 10, 2025 • 20 • 2
tiantiaf/voxlect-arabic-dialect-whisper-small Audio Classification • 90.4M • Updated Aug 10, 2025 • 5 • 2
Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report Paper • 2508.01059 • Published Aug 1, 2025 • 34
Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe Paper • 2508.01691 • Published Aug 3, 2025 • 10
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning Paper • 2505.16410 • Published May 22, 2025 • 58