Geyang
geyang627
AI & ML interests
None yet
Recent Activity
upvoted a paper about 12 hours ago
Safe and Scalable Web Agent Learning via Recreated Websites
upvoted an article 15 days ago
Deriving the PPO Loss from First Principles
upvoted an article 15 days ago
A Guide to Reinforcement Learning Post-Training for LLMs: PPO, DPO, GRPO, and Beyond