TradingAgents: Multi-Agents LLM Financial Trading Framework Paper • 2412.20138 • Published Dec 28, 2024 • 102
You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories Paper • 2605.21468 • Published May 20 • 51
MetaAgent-X : Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning Paper • 2605.14212 • Published May 14 • 18
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 233
Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning Paper • 2602.10090 • Published Feb 10 • 53
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published Apr 15 • 126
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 38 items • Updated Mar 2 • 372
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published Jan 30 • 229
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published Mar 16 • 187
NeMo Gym Collection Collection of RL verifiable data for NeMo Gym • 32 items • Updated 13 days ago • 62
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation Paper • 2502.16707 • Published Feb 23, 2025 • 14
Scaling Embeddings Outperforms Scaling Experts in Language Models Paper • 2601.21204 • Published Jan 29 • 105
view article Article Open Responses: What you need to know +2 evalstate, burtenshaw, merve, pcuenq • Jan 15 • 112
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published Feb 20, 2025 • 110