Prompt Replay: Speeding Up GRPO with On-Policy Reuse of High-Signal Prompts Paper • 2603.21177 • Published Mar 22 • 1 • 1
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published Aug 6, 2025 • 129 • 9
Qualixar OS: A Universal Operating System for AI Agent Orchestration Paper • 2604.06392 • Published Apr 7 • 19 • 6