---
base_model:
  - mistralai/Mistral-7B-Instruct-v0.2
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---

# RESM-Mistral-7B

This repository contains the model weights presented in the paper [Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging](https://arxiv.org/abs/2502.06876).
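The weights can be loaded with the standard `transformers` text-generation API. A minimal sketch follows; the `model_id` below is a placeholder, not the actual Hub repository id, and the prompt is purely illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id: replace with this repository's actual Hub id.
model_id = "RESM-Mistral-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# The base model is instruction-tuned, so use the chat template.
messages = [{"role": "user", "content": "Explain what 3H alignment means."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

`device_map="auto"` requires the `accelerate` package; on CPU-only machines you can drop it and load the model directly.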

## Description

This model is a 3H-aligned (Helpfulness, Honesty, and Harmlessness) Large Language Model developed using a novel model merging framework called RESM (Reweighting Enhanced task Singular Merging).

RESM addresses two challenges inherent in merging 3H-aligned LLMs, preference noise accumulation and layer sparsity adaptation, via outlier weighting and sparsity-aware rank selection strategies. Compared to standard data mixing or conventional merging methods, RESM achieves a more balanced optimization across the three alignment dimensions.
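To make the general idea concrete, below is a rough sketch of low-rank task-vector merging: each fine-tuned model's delta from the base is compressed by truncated SVD before the deltas are combined. This is only an illustration of the family of techniques RESM belongs to, not the authors' implementation; the function name, the fixed rank, and the uniform weights are all assumptions, and RESM's actual outlier weighting and sparsity-aware rank selection are not reproduced here.

```python
import numpy as np

def merge_task_vectors(base, finetuned_list, rank=2, weights=None):
    """Illustrative low-rank merge of task vectors (NOT the paper's RESM).

    Each task vector (finetuned - base) is compressed via truncated SVD,
    and the low-rank reconstructions are combined with per-task weights.
    """
    if weights is None:
        weights = [1.0 / len(finetuned_list)] * len(finetuned_list)
    merged_delta = np.zeros_like(base)
    for w, ft in zip(weights, finetuned_list):
        delta = ft - base  # task vector for this fine-tuned model
        U, S, Vt = np.linalg.svd(delta, full_matrices=False)
        low_rank = U[:, :rank] @ np.diag(S[:rank]) @ Vt[:rank, :]
        merged_delta += w * low_rank
    return base + merged_delta

# Toy example: merge two hypothetical fine-tuned weight matrices.
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 8))
ft_a = base + rng.normal(scale=0.1, size=(8, 8))
ft_b = base + rng.normal(scale=0.1, size=(8, 8))
merged = merge_task_vectors(base, [ft_a, ft_b], rank=2)
print(merged.shape)  # (8, 8)
```

In a real merge this operation would be applied per layer, which is where layer-dependent rank selection (as in RESM) matters.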

## Citation

```bibtex
@article{yang2025mix,
  title={Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging},
  author={Yang, Jinluan and Jin, Dingnan and Tang, Anke and Shen, Li and Zhu, Didi and Chen, Zhengyu and Wang, Daixin and Cui, Qing and Zhang, Zhiqiang and Zhou, Jun and others},
  journal={arXiv preprint arXiv:2502.06876},
  year={2025}
}
```