arxiv:2606.24636
Xinyu Mao
hector-mao
AI & ML interests
Multimodal Large Language Models, Vision Language Models
Recent Activity
authored a paper about 23 hours ago
CineCap: Structured Reasoning with Spatio-Temporal Anchors for Cinematographic Video Captioning
updated a model 2 days ago
hector-mao/CineCap-GRPO-8B
Organizations
None yet