Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization • Paper • 2505.23387 • Published May 29, 2025 • 8
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models • Paper • 2505.10554 • Published May 15, 2025 • 120
WaterDrum: Watermarking for Data-centric Unlearning Metric • Paper • 2505.05064 • Published May 8, 2025 • 8