ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall Paper • 2510.07896 • Published Oct 9, 2025 • 8
CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models Paper • 2602.17684 • Published Feb 4 • 22
Secure Code Generation via Online Reinforcement Learning with Vulnerability Reward Model Paper • 2602.07422 • Published Feb 7 • 22
Improving Data and Reward Design for Scientific Reasoning in Large Language Models Paper • 2602.08321 • Published Feb 9 • 42
MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration Paper • 2602.01734 • Published Feb 2 • 32
MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents Paper • 2601.12346 • Published Jan 18 • 49
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published Nov 20, 2025 • 93