MutiModal_Dataset
WildVision/wildvision-chat
Updated • 45.2k • 417 • 20
lmms-lab/LLaVA-Video-178K
Updated • 1.63M • 40.1k • 194
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
Paper • 2409.04429 • Published
JefferyZhan/Language-prompted-Localization-Dataset
Updated • 119 • 4
mlfoundations/MINT-1T-HTML
Updated • 623M • 189k • 95
DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding
Paper • 2411.14347 • Published • 16
Salesforce/blip3-grounding-50m
Updated • 52.4M • 625 • 28
Intelligent-Internet/II-Thought-RL-v0
Updated • 342k • 903 • 54
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
Paper • 2504.11456 • Published • 12