MoT-PL Collection Polish translation of Mixture-of-Thoughts subset and its variants. • 6 items • Updated 17 days ago
Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning Paper • 2510.03259 • Published Sep 26, 2025 • 57
Large Reasoning Models Learn Better Alignment from Flawed Thinking Paper • 2510.00938 • Published Oct 1, 2025 • 59