---
base_model:
  - mistralai/Mistral-7B-Instruct-v0.2
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---

# RESM-Mistral-7B

This repository contains the model weights for RESM-Mistral-7B, introduced in the paper [Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging](https://arxiv.org/abs/2502.06876).

## Model Description

RESM-Mistral-7B is an aligned Large Language Model (LLM) based on the Mistral-7B-Instruct-v0.2 architecture. It is designed to achieve a balanced alignment of Helpfulness, Honesty, and Harmlessness (3H optimization).
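Since the card declares `library_name: transformers` and `pipeline_tag: text-generation`, the model can presumably be loaded with the standard `transformers` chat workflow. A minimal sketch follows; the repo id is a placeholder (this card does not state the published id), and generation settings are illustrative, not the authors' recommended configuration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id: substitute this repository's actual Hub id.
model_id = "RESM-Mistral-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Mistral-7B-Instruct-v0.2 ships a chat template, which the merged model inherits.
messages = [{"role": "user", "content": "When can helpfulness and harmlessness conflict?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```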

The model was created using a novel parameter-level merging method called RESM (Reweighting Enhanced task Singular Merging). RESM addresses the challenges of preference noise accumulation and layer sparsity adaptation through:

  • Outlier weighting: Mitigating the impact of preference noise.
  • Sparsity-aware rank selection: Adapting to layer-wise variations in parameter importance.

This approach allows the model to better navigate the collaborative and conflicting relationships between the 3H dimensions compared to traditional data mixture strategies.
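The full RESM procedure is specified in the paper; as an illustrative sketch only (not the released implementation), the core idea of merging reweighted task deltas while adapting the retained rank per layer can be approximated with an SVD energy criterion. The function name, the energy threshold, and the uniform weights below are all assumptions for the toy example.

```python
import numpy as np

def truncated_svd_merge(deltas, weights, energy=0.9):
    """Toy merge of per-layer task deltas (fine-tuned minus base weights):
    reweight, sum, then keep only the top singular directions capturing
    `energy` of the squared spectrum, so the kept rank varies by layer."""
    merged = sum(w * d for w, d in zip(weights, deltas))
    U, S, Vt = np.linalg.svd(merged, full_matrices=False)
    cum = np.cumsum(S ** 2) / np.sum(S ** 2)
    r = int(np.searchsorted(cum, energy)) + 1  # layer-adaptive rank
    return (U[:, :r] * S[:r]) @ Vt[:r]

# Two synthetic "task deltas" for one 8x8 layer.
d1 = np.random.default_rng(0).normal(size=(8, 8))
d2 = np.random.default_rng(1).normal(size=(8, 8))
merged = truncated_svd_merge([d1, d2], weights=[0.5, 0.5])
```

A lower energy threshold discards more of the noisy tail of the spectrum, which loosely mirrors the paper's goal of suppressing accumulated preference noise while keeping the dominant task directions.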

## Citation

```bibtex
@article{yang2025mix,
  title={Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging},
  author={Yang, Jinluan and Jin, Dingnan and Tang, Anke and Shen, Li and Zhu, Didi and Chen, Zhengyu and Wang, Daixin and Cui, Qing and Zhang, Zhiqiang and Zhou, Jun and others},
  journal={arXiv preprint arXiv:2502.06876},
  year={2025}
}
```