Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models Paper • 2506.06395 • Published Jun 5, 2025 • 133
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers Paper • 2504.20752 • Published Apr 29, 2025 • 92