reinforcement learning - a nezubn Collection

reinforcement learning

updated May 24, 2024

Understanding the performance gap between online and offline alignment algorithms

Paper • 2405.08448 • Published May 14, 2024 • 18

OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Paper • 2405.11143 • Published May 20, 2024 • 41