---
base_model:
- mistralai/Mistral-7B-Instruct-v0.2
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---
# RESM-Mistral-7B
This repository contains the model weights for RESM-Mistral-7B, introduced in the paper [Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging](https://arxiv.org/abs/2502.06876).
## Model Description
RESM-Mistral-7B is an aligned Large Language Model (LLM) built on mistralai/Mistral-7B-Instruct-v0.2. It is designed to achieve a balanced alignment across Helpfulness, Honesty, and Harmlessness (3H optimization).
The model was created using a novel parameter-level merging method called RESM (Reweighting Enhanced task Singular Merging). RESM addresses the challenges of preference noise accumulation and layer sparsity adaptation through:
- Outlier weighting: Mitigating the impact of preference noise.
- Sparsity-aware rank selection: Adapting to layer-wise variations in parameter importance.
This approach allows the model to better navigate the collaborative and conflicting relationships between the 3H dimensions compared to traditional data mixture strategies.
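The paper's exact RESM algorithm is not reproduced in this card. As a rough, illustrative sketch of the general idea behind singular-value-based task-vector merging (low-rank truncation of per-preference weight deltas before combining them), one might write something like the following; the function names, the fixed rank, and the uniform weighting are assumptions for illustration, not the paper's method:

```python
# Illustrative sketch ONLY: rank-truncated task-vector merging via SVD,
# loosely in the spirit of singular-based merging. RESM's actual outlier
# weighting and sparsity-aware rank selection are defined in the paper;
# everything named here is a simplifying assumption.
import numpy as np

def truncated_task_vector(base, tuned, rank):
    """SVD of the task vector (tuned - base), keeping only the top `rank` components."""
    delta = tuned - base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

def merge(base, tuned_models, rank, weights=None):
    """Add rank-truncated task vectors from several fine-tuned models onto the base."""
    if weights is None:
        weights = [1.0 / len(tuned_models)] * len(tuned_models)
    merged = base.copy()
    for w, tuned in zip(weights, tuned_models):
        merged += w * truncated_task_vector(base, tuned, rank)
    return merged

# Toy demo on a single 8x8 weight matrix, standing in for one layer,
# with two hypothetical preference-tuned variants of the base weights.
rng = np.random.default_rng(0)
base = rng.standard_normal((8, 8))
helpful = base + 0.1 * rng.standard_normal((8, 8))
harmless = base + 0.1 * rng.standard_normal((8, 8))
merged = merge(base, [helpful, harmless], rank=2)
print(merged.shape)  # (8, 8)
```

In practice such a merge runs per layer over the full state dicts of the fine-tuned checkpoints, and RESM additionally reweights outlier directions and picks the rank per layer rather than using a single fixed value.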
## Citation
```bibtex
@article{yang2025mix,
  title={Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging},
  author={Yang, Jinluan and Jin, Dingnan and Tang, Anke and Shen, Li and Zhu, Didi and Chen, Zhengyu and Wang, Daixin and Cui, Qing and Zhang, Zhiqiang and Zhou, Jun and others},
  journal={arXiv preprint arXiv:2502.06876},
  year={2025}
}
```