Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese • Paper • 2408.12480 • Published Aug 22, 2024 • 28
UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling • Paper • 2408.04810 • Published Aug 9, 2024 • 24
Mediocreatmybest/instructblip-flan-t5-xxl_8bit_nf4 • Image-to-Text • 12B • Updated Aug 27, 2023 • 6 • 1