Papers
arxiv:2505.02666

A Survey on Progress in LLM Alignment from the Perspective of Reward Design

Published on Aug 29, 2025
Authors:
,
,
,
,
,
,

Abstract

Reward design is crucial for aligning large language models with human values, encompassing mathematical formulations, construction practices, and optimization interactions, with evolving trends toward RL-free methods and multi-objective frameworks.

Reward design plays a pivotal role in aligning large language models (LLMs) with human values, serving as the bridge between feedback signals and model optimization. This survey provides a structured organization of reward modeling and addresses three key aspects: mathematical formulation, construction practices, and interaction with optimization paradigms. Building on this, it develops a macro-level taxonomy that characterizes reward mechanisms along complementary dimensions, thereby offering both conceptual clarity and practical guidance for alignment research. The progression of LLM alignment can be understood as a continuous refinement of reward design strategies, with recent developments highlighting paradigm shifts from reinforcement learning (RL)-based to RL-free optimization and from single-task to multi-objective and complex settings.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2505.02666
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2505.02666 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2505.02666 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2505.02666 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.