InSight-o3 - a m-Just Collection

InSight-o3

Updated Mar 24

Empowering Multimodal Foundation Models with Generalized Visual Search

InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search

Paper • 2512.18745 • Published Dec 21, 2025 • 12

m-Just/O3-Bench

Viewer • Updated Jan 26 • 345 • 329 • 16

Note: Can your AI agent truly "think with images"? Test it out on O3-Bench!

m-Just/InSight-o3-vS

Image-Text-to-Text • 8B • Updated Jan 29 • 3

Note: This is the vSearcher model introduced in our work.

m-Just/VisCoT_VStar_Collage

Viewer • Updated Jan 29 • 15.3k • 145 • 3

Note: In-loop RL training data for vSearcher.

m-Just/InfoVQA_RegionLocalization

Viewer • Updated Jan 29 • 10.2k • 48 • 1

Note: Out-of-loop RL training data for vSearcher.