VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation Paper • 2412.10768 • Published Dec 14, 2024
AVROBUSTBENCH: Benchmarking the Robustness of Audio-Visual Recognition Models at Test-Time Paper • 2506.00358 • Published May 31, 2025
Object-WIPER: Training-Free Object and Associated Effect Removal in Videos Paper • 2601.06391 • Published 20 days ago • 2
AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation Paper • 2406.07686 • Published Jun 11, 2024 • 17