Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment NormalUhr • Feb 11, 2025 • 125
SkillOpt: Executive Strategy for Self-Evolving Agent Skills Paper • 2605.23904 • Published 9 days ago • 204
Don't Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG Paper • 2604.14572 • Published Apr 16 • 7