Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) • ariG23498 • Jan 19, 2025 • 50
Exploring Quantization Backends in Diffusers • derekl35, marcsun13, sayakpaul • May 21, 2025 • 45
nanoVLM: The simplest repository to train your VLM in pure PyTorch • ariG23498, lusxvr, andito, sergiopaniego, merve, pcuenq, reach-vb • May 21, 2025 • 258
(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware • derekl35, marcsun13, sayakpaul, merve, linoyts • Jun 19, 2025 • 105
StackLLaMA: A hands-on guide to train LLaMA with RLHF • edbeeching, kashif, ybelkada, lewtun, lvwerra, nazneen, natolambert • Apr 5, 2023 • 48
Preference Tuning LLMs with Direct Preference Optimization Methods • kashif, edbeeching, lewtun, lvwerra, osanseviero • Jan 18, 2024 • 83