OpenGVLab/InternVL2_5-2B-MPO
Image-Text-to-Text • 2B • Updated • 133 • 12
Computer Vision
InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision
VKnowU: Evaluating Visual Knowledge Understanding in Multimodal LLMs