Article: Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment • Feb 11 • 94
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published Mar 14 • 146
The Ultra-Scale Playbook • 3.6k • The ultimate guide to training LLM on large GPU Clusters
Uni-SMART: Universal Science Multimodal Analysis and Research Transformer Paper • 2403.10301 • Published Mar 15, 2024 • 54