|
|
--- |
|
|
license: apache-2.0 |
|
|
base_model: |
|
|
- Qwen/Qwen2.5-1.5B-Instruct |
|
|
tags: |
|
|
- safety |
|
|
- alignment |
|
|
--- |
|
|
|
|
|
# Qwen2.5-1.5B-Instruct-EASE |
|
|
This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) on the [EASE-SafetyReasoning](https://huggingface.co/datasets/HaonanShi/EASE-STAR41K-SafetyReasoning-10K) dataset. |
|
|
|
|
|
## Model description |
|
|
This is the safety reasoning aligned version model under the framework,[**EASE**](https://arxiv.org/pdf/2511.06512). We fine-tune Qwen2.5-1.5B-Instruct to enable **adaptive safety reasoning activation**. The model triggers explicit safety reasoning only under jailbreak-like semantics, while avoiding unnecessary safety reasoning on benign or general prompts. This design aims to maintain the model’s general task effectiveness and efficiency, while improving robustness against jailbreak attacks. |
|
|
|
|
|
## Intended use |
|
|
Safety-oriented research on:(1)Safety alignment, (2)Small language models and (3)Jailbreak robustness |
|
|
|
|
|
## Citation |
|
|
If our model could help you, please cite our paper, thanks!🤗 |
|
|
|
|
|
**EASE: Practical and Efficient Safety Alignment for Small Language Models(AAAI26)(oral)** |
|
|
|
|
|
https://arxiv.org/pdf/2511.06512 |
|
|
|
|
|
|