Arbitrary Entropy Policy Optimization: Entropy Is Controllable in Reinforcement Fine-tuning Paper • 2510.08141 • Published Oct 9, 2025
Distribution-Centric Policy Optimization Dominates Exploration-Exploitation Trade-off Paper • 2601.12730 • Published 9 days ago