TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering Paper • 2404.01476 • Published Apr 1, 2024 • 1
chuyishang/Qwen2.5-VL-3B-Instruct-SFT_qwen2.5_mmk12_tim_50split_qtemplate Image-to-Text • 4B • Updated May 14 • 10
chuyishang/Qwen2.5-VL-3B-Instruct-SFT_qwen2.5_mmk12_tim_50split Image-to-Text • 4B • Updated May 14 • 9
Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning Paper • 2406.15334 • Published Jun 21, 2024 • 9