HakHan committed on
Commit c5c7f70 · verified · 1 Parent(s): c6a41ac

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -1,3 +1,5 @@
+Model for paper [SafeSwitch: Steering Unsafe LLM Behavior via Internal Activation Signals](arxiv.org/abs/2502.01042).
+
 Refer to our [code repo](https://github.com/Hanpx20/SafeSwitch) for usage.
 
 `refusal_head.pth`: the refusal head.