Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) (Jan 19)
nanoVLM: The simplest repository to train your VLM in pure PyTorch (May 21)
Preference Tuning LLMs with Direct Preference Optimization Methods (Jan 18, 2024)