---
base_model:
- mistralai/Mistral-7B-Instruct-v0.2
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---
# RESM-Mistral-7B

This repository contains the model weights presented in the paper *Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging*.
## Description
This model is a 3H-aligned (Helpful, Honest, and Harmless) large language model built with RESM (Reweighting Enhanced task Singular Merging), a novel model merging framework.

RESM addresses two challenges inherent in merging 3H-aligned LLMs, preference noise accumulation and layer sparsity adaptation, through outlier weighting and sparsity-aware rank selection strategies. Compared with standard data-mixing approaches and existing merging methods, RESM achieves a more balanced optimization across the three alignment dimensions.
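## Usage

Since the card declares `library_name: transformers` and `pipeline_tag: text-generation`, the weights can be loaded with the standard 🤗 Transformers API. A minimal sketch follows; the Hub repo id below is an assumption (substitute the actual path of this repository), and the chat template follows the Mistral-7B-Instruct-v0.2 base model.

```python
# Hedged usage sketch: MODEL_ID is an assumed placeholder, not a confirmed Hub path.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "RESM-Mistral-7B"  # replace with the actual repository id


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the merged model and return a completion for a single user prompt."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    # Mistral-Instruct models expect the [INST] ... [/INST] chat format,
    # which apply_chat_template produces from a messages list.
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)


# Example call (downloads ~14 GB of weights on first use):
# print(generate("How do I safely dispose of old batteries?"))
```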
## Citation
```bibtex
@article{yang2025mix,
  title={Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging},
  author={Yang, Jinluan and Jin, Dingnan and Tang, Anke and Shen, Li and Zhu, Didi and Chen, Zhengyu and Wang, Daixin and Cui, Qing and Zhang, Zhiqiang and Zhou, Jun and others},
  journal={arXiv preprint arXiv:2502.06876},
  year={2025}
}
```