When Does Trajectory-Level Supervision Permit Efficient Offline Reinforcement Learning? Paper • 2606.18531 • Published 16 days ago • 4
Understanding the Challenges in Iterative Generative Optimization with LLMs Paper • 2603.23994 • Published Mar 25 • 29
Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States Paper • 2603.19987 • Published Mar 20 • 9
Article Open-source DeepResearch – Freeing our search agents +3 m-ric, albertvillanova, merve, thomwolf, clefourrier • Feb 4, 2025 • 1.32k