Collections
Discover the best community collections!
Collections trending this week
- Secrets of RLHF in Large Language Models Part II: Reward Modeling
  Paper • 2401.06080 • Published • 27
- Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
  Paper • 2406.02900 • Published • 13
- AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
  Paper • 2406.04151 • Published • 24
- Understanding and Diagnosing Deep Reinforcement Learning
  Paper • 2406.16979 • Published • 10