DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning Paper • 2605.25604 • Published about 1 month ago • 138
ApsaraStackMaaS/EvoQwen2.5-VL-Retriever-3B-v1 Visual Document Retrieval • 4B • Updated Apr 9 • 21 • 13
Vidore Leaderboard • Running • Agents • 208 • Browse and compare visual document retrieval model scores
ApsaraStackMaaS/EvoQwen2.5-VL-Retriever-7B-v1 Visual Document Retrieval • 8B • Updated Apr 9 • 34 • 17
PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning Paper • 2508.21104 • Published Aug 28, 2025 • 37