SEIF: Self-Evolving Reinforcement Learning for Instruction Following • Paper • 2605.07465 • Published 5 days ago • 25
Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation • Paper • 2508.12040 • Published Aug 16, 2025 • 14
A Stitch in Time Saves Nine: Proactive Self-Refinement for Language Models • Paper • 2508.12903 • Published Aug 18, 2025 • 11
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification • Paper • 2508.05629 • Published Aug 7, 2025 • 189