Why Supervised Fine-Tuning Fails to Learn: A Systematic Study of Incomplete Learning in Large Language Models Paper • 2604.10079 • Published Apr 24
Reason Only When Needed: Efficient Generative Reward Modeling via Model-Internal Uncertainty Paper • 2604.10072 • Published May 3