GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine? Paper • 2606.17861 • Published 10 days ago • 55
PhoneHarness: Harnessing Phone-Use Agents through Mixed GUI, CLI, and Tool Actions Paper • 2606.14832 • Published 14 days ago • 12
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation Paper • 2605.18739 • Published May 18 • 115
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published Apr 6 • 116
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2, 2025 • 84
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning Paper • 2507.13348 • Published Jul 17, 2025 • 80
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild Paper • 2503.18892 • Published Mar 24, 2025 • 31
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition Paper • 2412.09501 • Published Dec 12, 2024 • 48
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper • 2412.04467 • Published Dec 5, 2024 • 119
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 148
Step-DPO Collection Resources for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" • 11 items • Updated Jul 1, 2024 • 5