Qwen-AgentWorld: Language World Models for General Agents Paper • 2606.24597 • Published 4 days ago • 123
Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients Paper • 2606.18216 • Published 11 days ago • 61
TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration Paper • 2606.04743 • Published 24 days ago • 46
OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources Paper • 2605.29250 • Published 30 days ago • 78
Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents Paper • 2605.28775 • Published about 1 month ago • 38
LearnWeak Collection Checkpoints trained with LearnWeak (Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Age) • 8 items • Updated 30 days ago
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published about 1 month ago • 93
HoliSafe: Holistic Safety Benchmarking and Modeling with Safety Meta Token for Vision-Language Model Paper • 2506.04704 • Published Jun 5, 2025 • 3
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning Paper • 2512.02425 • Published Dec 2, 2025 • 25
Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents Paper • 2604.14004 • Published Apr 15 • 30
It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs Paper • 2605.20258 • Published May 18 • 30