LaViT: Aligning Latent Visual Thoughts for Multi-modal Reasoning • Paper • 2601.10129 • Published 2 days ago
HumanEval-V: Benchmarking High-Level Visual Reasoning with Complex Diagrams in Coding Tasks • Paper • 2410.12381 • Published Oct 16, 2024