m-Just/O3-Bench
Empowering Multimodal Foundation Models with Generalized Visual Search
Note: Can your AI agent truly "think with images"? Test it out on O3-Bench!
Note: This is the vSearcher model introduced in our work.
Note: In-loop RL training data for vSearcher.
Note: Out-of-loop RL training data for vSearcher.