RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation Paper • 2601.08430 • Published 16 days ago • 58
DentalGPT: Incentivizing Multimodal Complex Reasoning in Dentistry Paper • 2512.11558 • Published Dec 12, 2025 • 44
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis Paper • 2508.13618 • Published Aug 19, 2025 • 18
audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim Audio Classification • 0.2B • Updated Sep 19, 2024 • 831k • 147
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation Paper • 2506.18095 • Published Jun 22, 2025 • 66
S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information Paper • 2503.05085 • Published Mar 7, 2025 • 47
S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information Paper • 2503.05085 • Published Mar 7, 2025 • 47
Soundwave: Less is More for Speech-Text Alignment in LLMs Paper • 2502.12900 • Published Feb 18, 2025 • 86