CACARA: Cross-Modal Alignment Leveraging a Text-Centric Approach for Cost-Effective Multimodal and Multilingual Learning Paper • 2512.00496 • Published Nov 29, 2025
Enhancing Speech Emotion Recognition with Graph-Based Multimodal Fusion and Prosodic Features for the Speech Emotion Recognition in Naturalistic Conditions Challenge at Interspeech 2025 Paper • 2506.02088 • Published Jun 2, 2025
FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion Paper • 2501.05586 • Published Jan 9, 2025
FairPIVARA: Reducing and Assessing Biases in CLIP-Based Multimodal Models Paper • 2409.19474 • Published Sep 28, 2024
Brazilian Portuguese Speech Recognition Using Wav2vec 2.0 Paper • 2107.11414 • Published Jul 23, 2021
Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge Paper • 2207.14418 • Published Jul 29, 2022
CAPIVARA: Cost-Efficient Approach for Improving Multilingual CLIP Performance on Low-Resource Languages Paper • 2310.13683 • Published Oct 20, 2023