VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection Paper • 2505.20289 • Published May 26, 2025 • 11
MAOAM: Unified Object and Material Selection with Vision-Language Models Paper • 2606.04880 • Published 27 days ago • 10
From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended Image Editing Paper • 2605.15181 • Published May 14 • 12
Exploration and Exploitation Errors Are Measurable for Language Model Agents Paper • 2604.13151 • Published Apr 14 • 25 • 3
thaoshibe/relsim-anonymous-caption-qwen25vl-lora Image-Text-to-Text • Updated Dec 13, 2025 • 3 • 1