RoboCurate: Harnessing Diversity with Action-Verified Neural Trajectory for Robot Learning Paper • 2602.18742 • Published 21 days ago • 11
Article • Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment • Feb 11, 2025 • 110