Article Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty? zhangchenxu • Feb 25 • 14
TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments Paper • 2510.01179 • Published Oct 1, 2025 • 28
FlowRL: Matching Reward Distributions for LLM Reasoning Paper • 2509.15207 • Published Sep 18, 2025 • 118