Qwen2.5-1.5B-Instruct-EASE
This model is a fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct on the EASE-SafetyReasoning dataset.
Model description
This is the safety reasoning aligned version model under the framework,EASE. We fine-tune Qwen2.5-1.5B-Instruct to enable adaptive safety reasoning activation. The model triggers explicit safety reasoning only under jailbreak-like semantics, while avoiding unnecessary safety reasoning on benign or general prompts. This design aims to maintain the model’s general task effectiveness and efficiency, while improving robustness against jailbreak attacks.
Intended use
Safety-oriented research on:(1)Safety alignment, (2)Small language models and (3)Jailbreak robustness
Citation
If our model could help you, please cite our paper, thanks!🤗
EASE: Practical and Efficient Safety Alignment for Small Language Models(AAAI26)(oral)
- Downloads last month
- 4