alignment_24_best
KTO: Model Alignment as Prospect Theoretic Optimization
Paper
• 2402.01306
• Published • 22
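(For quick orientation, a sketch of the KTO objective as I recall it from the paper; the notation below is mine, not from this listing: pi_theta is the policy, pi_ref the frozen reference, beta a scale, lambda_D / lambda_U weights for desirable and undesirable outputs, and z_0 a batch-level KL reference point.)

\[
r_\theta(x,y) = \log\frac{\pi_\theta(y\mid x)}{\pi_{\mathrm{ref}}(y\mid x)},\qquad
v(x,y) =
\begin{cases}
\lambda_D\,\sigma\!\big(\beta\,(r_\theta(x,y) - z_0)\big) & y \text{ desirable}\\[2pt]
\lambda_U\,\sigma\!\big(\beta\,(z_0 - r_\theta(x,y))\big) & y \text{ undesirable}
\end{cases}
\]
\[
\mathcal{L}_{\mathrm{KTO}} = \mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\lambda_y - v(x,y)\big]
\]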
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper
• 2305.18290
• Published • 64
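(The DPO loss is short enough to state inline; this is the standard form as I understand it, with my own notation: pi_theta is the trained policy, pi_ref a frozen reference, y_w / y_l the preferred and dispreferred responses, beta a temperature, sigma the logistic function.)

\[
\mathcal{L}_{\mathrm{DPO}}(\theta) = -\,\mathbb{E}_{(x,y_w,y_l)\sim\mathcal{D}}\left[\log\sigma\!\left(\beta\log\frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)} - \beta\log\frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}\right)\right]
\]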
SimPO: Simple Preference Optimization with a Reference-Free Reward
Paper
• 2405.14734
• Published • 12
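(SimPO drops the reference model; as I recall, it replaces the implicit reward with the length-normalized log-likelihood and adds a target margin gamma. Notation below is mine, not from this listing.)

\[
\mathcal{L}_{\mathrm{SimPO}}(\theta) = -\,\mathbb{E}_{(x,y_w,y_l)\sim\mathcal{D}}\left[\log\sigma\!\left(\frac{\beta}{|y_w|}\log\pi_\theta(y_w\mid x) - \frac{\beta}{|y_l|}\log\pi_\theta(y_l\mid x) - \gamma\right)\right]
\]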
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
Paper
• 2408.06266
• Published • 10
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs
Paper
• 2402.14740
• Published • 18
Binary Classifier Optimization for Large Language Model Alignment
Paper
• 2404.04656
• Published • 2
Noise Contrastive Alignment of Language Models with Explicit Rewards
Paper
• 2402.05369
• Published • 2
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
Paper
• 2401.08417
• Published • 37
Direct Language Model Alignment from Online AI Feedback
Paper
• 2402.04792
• Published • 35
Nash Learning from Human Feedback
Paper
• 2312.00886
• Published • 18
ORPO: Monolithic Preference Optimization without Reference Model
Paper
• 2403.07691
• Published • 72
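(ORPO is likewise reference-free; to the best of my recollection it adds an odds-ratio penalty to the ordinary SFT loss on the chosen response. Notation is mine; lambda weights the penalty.)

\[
\mathrm{odds}_\theta(y\mid x) = \frac{P_\theta(y\mid x)}{1 - P_\theta(y\mid x)},\qquad
\mathcal{L}_{\mathrm{OR}} = -\log\sigma\!\left(\log\frac{\mathrm{odds}_\theta(y_w\mid x)}{\mathrm{odds}_\theta(y_l\mid x)}\right)
\]
\[
\mathcal{L}_{\mathrm{ORPO}} = \mathbb{E}_{(x,y_w,y_l)\sim\mathcal{D}}\big[\mathcal{L}_{\mathrm{SFT}}(y_w\mid x) + \lambda\,\mathcal{L}_{\mathrm{OR}}\big]
\]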
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF
Paper
• 2405.21046
• Published • 4
From r to Q^*: Your Language Model is Secretly a Q-Function
Paper
• 2404.12358
• Published • 2
Offline Regularised Reinforcement Learning for Large Language Models Alignment
Paper
• 2405.19107
• Published • 15
Towards Scalable Automated Alignment of LLMs: A Survey
Paper
• 2406.01252
• Published • 3
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Paper
• 2409.02795
• Published • 72
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
Paper
• 2404.14367
• Published • 1
Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels
Paper
• 2404.14313
• Published
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback
Paper
• 2406.09279
• Published • 3
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
Paper
• 2404.10719
• Published • 6
Advancing LLM Reasoning Generalists with Preference Trees
Paper
• 2404.02078
• Published • 46
Building Math Agents with Multi-Turn Iterative Preference Learning
Paper
• 2409.02392
• Published • 16
Not All Preference Pairs Are Created Equal: A Recipe for Annotation-Efficient Iterative Preference Learning
Paper
• 2406.17312
• Published
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Paper
• 2406.00888
• Published • 33
Contrastive Preference Learning: Learning from Human Feedback without RL
Paper
• 2310.13639
• Published • 25
Towards Efficient and Exact Optimization of Language Model Alignment
Paper
• 2402.00856
• Published • 2
HelpSteer2-Preference: Complementing Ratings with Preferences
Paper
• 2410.01257
• Published • 24
General Preference Modeling with Preference Representations for Aligning Language Models
Paper
• 2410.02197
• Published • 9
Modulated Intervention Preference Optimization (MIPO): Keep the Easy, Refine the Difficult
Paper
• 2409.17545
• Published • 20
RLHF Workflow: From Reward Modeling to Online RLHF
Paper
• 2405.07863
• Published • 71
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
Paper
• 2410.04612
• Published
Understanding Likelihood Over-optimisation in Direct Alignment Algorithms
Paper
• 2410.11677
• Published • 1