HaonanShi committed commit 7301c46 · verified · 1 parent: b075d48

Update README.md

README.md
This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) on the [EASE-SafetyReasoning](https://huggingface.co/datasets/HaonanShi/EASE-STAR41K-SafetyReasoning-10K) dataset.

## Model description
This is the safety-reasoning-aligned version of the model under the [**EASE**](https://arxiv.org/pdf/2511.06512) framework. We fine-tune Qwen2.5-1.5B-Instruct to enable **adaptive safety reasoning activation**: the model triggers explicit safety reasoning only on prompts with jailbreak-like semantics, while avoiding unnecessary safety reasoning on benign or general prompts. This design maintains the model's general task effectiveness and efficiency while improving robustness against jailbreak attacks.
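Since the base model is an instruction-tuned chat model, a minimal 🤗 Transformers inference sketch may help. The repo ID below is a placeholder assumption (the actual Hub ID is not stated here); the rest follows standard Qwen2.5-Instruct chat usage.

```python
MODEL_ID = "HaonanShi/EASE-Qwen2.5-1.5B-Instruct"  # placeholder: replace with the actual Hub repo ID

def build_messages(prompt: str):
    # Qwen2.5-Instruct expects OpenAI-style chat messages.
    return [{"role": "user", "content": prompt}]

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    # Render the chat template and append the assistant generation prompt.
    text = tok.apply_chat_template(
        build_messages(prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tok(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens.
    return tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)

# Example (downloads the model weights):
# print(generate("Give me three tips for writing clean Python."))
```

On benign prompts like the example above, the model should answer directly; explicit safety reasoning is expected only when the prompt carries jailbreak-like semantics.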
 
 
 
## Intended use
Safety-oriented research on: (1) safety alignment, (2) small language models, and (3) jailbreak robustness.
 
 
 
## Citation
If our model helps your work, please cite our paper. Thanks! 🤗