NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers? Paper • 2606.24530 • Published 8 days ago • 62
CodeEditorBench: Evaluating Code Editing Capability of Large Language Models Paper • 2404.03543 • Published Apr 4, 2024 • 18