FashionLens: Toward Versatile Fashion Image Retrieval via Task-Adaptive Learning Paper • 2605.22552 • Published 11 days ago • 2
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards Paper • 2605.21467 • Published 12 days ago • 204
VideoSeeker: Incentivizing Instance-level Video Understanding via Native Agentic Tool Invocation Paper • 2605.16079 • Published 17 days ago • 28
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 19 days ago • 269
StableI2I: Spotting Unintended Changes in Image-to-Image Transition Paper • 2605.04453 • Published 26 days ago • 11
ClawNet: Human-Symbiotic Agent Network for Cross-User Autonomous Cooperation Paper • 2604.19211 • Published Apr 21 • 11
Semantic Richness or Geometric Reasoning? The Fragility of VLM's Visual Invariance Paper • 2604.01848 • Published Apr 3 • 5
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 503
When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models Paper • 2604.08546 • Published Apr 9 • 115
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 630
LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset Paper • 2603.23607 • Published Mar 24 • 19
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 352
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models Paper • 2603.25716 • Published Mar 26 • 156
From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models Paper • 2602.22859 • Published Feb 26 • 150
Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs Paper • 2602.10388 • Published Feb 11 • 245