Diversity or Precision? A Deep Dive into Next Token Prediction • Paper • 2512.22955 • Published Dec 28, 2025
One-Token Rollout: Guiding Supervised Fine-Tuning of LLMs with Policy Gradient • Paper • 2509.26313 • Published Sep 30, 2025