Geyang
geyang627
AI & ML interests
None yet
Recent Activity
upvoted a paper about 4 hours ago
Safe and Scalable Web Agent Learning via Recreated Websites
upvoted an article 14 days ago
Deriving the PPO Loss from First Principles
upvoted an article 14 days ago
A Guide to Reinforcement Learning Post-Training for LLMs: PPO, DPO, GRPO, and Beyond