Add model card for Personalized-Qwen2.5-7B-Instruct

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +34 -0
README.md ADDED
@@ -0,0 +1,34 @@
+ ---
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: text-generation
+ ---
+
+ # Personalized-Qwen2.5-7B-Instruct
+
+ This repository hosts the `Personalized-Qwen2.5-7B-Instruct` model, developed as part of the research presented in the paper:
+
+ **[Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning](https://huggingface.co/papers/2510.18849)**
+
+ ## Abstract
+ Faithfully personalizing large language models (LLMs) to align with individual user preferences is a critical but challenging task. While supervised fine-tuning (SFT) quickly reaches a performance plateau, standard reinforcement learning from human feedback (RLHF) also struggles with the nuances of personalization. Scalar-based reward models are prone to reward hacking, which leads to verbose and superficially personalized responses. To address these limitations, we propose Critique-Post-Edit, a robust reinforcement learning framework that enables more faithful and controllable personalization. Our framework integrates two key components: (1) a Personalized Generative Reward Model (GRM) that provides multi-dimensional scores and textual critiques to resist reward hacking, and (2) a Critique-Post-Edit mechanism where the policy model revises its own outputs based on these critiques for more targeted and efficient learning. Under a rigorous length-controlled evaluation, our method substantially outperforms standard PPO on personalization benchmarks. The personalized Qwen2.5-7B model achieves an average 11% win-rate improvement, and the personalized Qwen2.5-14B model surpasses the performance of GPT-4.1. These results demonstrate a practical path to faithful, efficient, and controllable personalization.
+
+ ## Project Overview
+ This project provides a complete pipeline for training and evaluating large language models using the **Critique-Post-Edit** method. It includes scripts and configurations for Supervised Fine-Tuning (SFT) and Reinforcement Learning (PPO), leveraging open-source frameworks such as `LLaMA-Factory` and `verl`. Evaluation is conducted with `AlpacaEval` to ensure a fair and comprehensive assessment of model performance. The released models, including `Personalized-Qwen2.5-7B-Instruct` and `Personalized-Qwen2.5-14B-Instruct`, demonstrate significant improvements over the baseline models.
+
+ For more details on the framework, training, and evaluation, please refer to the [official GitHub repository](https://github.com/OPPO-PersonalAI/Critique-Post-Edit).
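+
+ ## Usage
+ As a minimal inference sketch, the model can be loaded with the `transformers` library. The repository id and the profile-in-system-prompt format below are illustrative assumptions, not details confirmed by the paper; check the Hub page and the GitHub repository for the exact id and prompt format:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Hypothetical repository id; replace with the actual Hub id.
+ model_id = "Personalized-Qwen2.5-7B-Instruct"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
+
+ # Illustrative only: user preferences passed via the system prompt.
+ messages = [
+     {"role": "system", "content": "User profile: prefers concise, bullet-point answers."},
+     {"role": "user", "content": "Summarize reinforcement learning from human feedback."},
+ ]
+ inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
+ outputs = model.generate(inputs, max_new_tokens=256)
+ print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
+ ```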
+
+ ## License
+ This project is licensed under the Apache License, Version 2.0.
+
+
+ ## Citation
+ If you find this work helpful, please consider citing the original paper:
+
+ ```bibtex
+ @article{zhu2025towards,
+   title={Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning},
+   author={Zhu, Chenghao and Tao, Meiling and Wang, Tiannan and Ding, Dongyi and Jiang, Yuchen Eleanor and Zhou, Wangchunshu},
+   journal={arXiv preprint arXiv:2510.18849},
+   year={2025}
+ }
+ ```