view article Article How NuminaMath Won the 1st AIMO Progress Prize +6 yfleureau, liyongsea, edbeeching, lewtun, benlipkin, romansoletskyi, vwxyzjn, kashif • Jul 11, 2024 • 128
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks Paper • 2404.14723 • Published Apr 23, 2024 • 10
Archangel Collection Archangel is a suite of human feedback-aligned LLMs, released as part of the Human-Aware Loss Functions (HALOs) project by Ethayarajh et al. (2024). • 77 items • Updated Feb 2, 2024 • 2