CoT-ST: Enhancing LLM-based Speech Translation with Multimodal Chain-of-Thought • Paper • 2409.19510 • Published Sep 29, 2024 • 1
MCGA: A Multi-task Classical Chinese Literary Genre Audio Corpus • Paper • 2601.09270 • Published Jan 14 • 1
Thinking with Comics: Enhancing Multimodal Reasoning through Structured Visual Storytelling • Paper • 2602.02453 • Published 30 days ago • 36