Article: Illustrating Reinforcement Learning from Human Feedback (RLHF) • natolambert, LouisCastricato, lvwerra, Dahoas • Dec 9, 2022 • 417
Collection: LLM Reasoning Papers • improve reasoning capabilities of LLMs • 45 items • Updated Feb 18, 2025 • 6