Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology Paper • 2507.07999 • Published Jul 10, 2025 • 51
DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World Paper • 2506.24102 • Published Jun 30, 2025
Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis Paper • 2001.01306 • Published Jan 5, 2020
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs Paper • 2510.18876 • Published Oct 21, 2025 • 37
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation Paper • 2204.05525 • Published Apr 12, 2022
CodeDance: A Dynamic Tool-integrated MLLM for Executable Visual Reasoning Paper • 2512.17312 • Published Dec 19, 2025 • 3
EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation Paper • 2603.12267 • Published Mar 12 • 13