Improve model card and add metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +15 -2
README.md CHANGED
@@ -1,14 +1,27 @@
  ---
  license: apache-2.0
+ library_name: transformers
+ pipeline_tag: text-generation
  ---
- This model is the official implementation of the paper: https://arxiv.org/abs/2602.07892.
+
+ # Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection
+
+ This model is the official implementation of the paper [Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection](https://arxiv.org/abs/2602.07892).
+
+ **OGPSA** (**O**rthogonal **G**radient **P**rojection for **S**afety **A**lignment) is a method that preserves general capabilities during safety alignment via an orthogonal gradient projection strategy, balancing safety with general utility. It estimates a low-rank reference subspace from gradients on a small set of general-capability data and removes from each safety gradient the component lying in this subspace.
+
+ ## Resources
+ - **Paper:** [https://arxiv.org/abs/2602.07892](https://arxiv.org/abs/2602.07892)
+ - **Code:** [https://github.com/SunGL001/OGPSA](https://github.com/SunGL001/OGPSA)
 
  ## Citation
  If you find this model or dataset useful in your research, please cite our paper:
 
+ ```bibtex
  @article{sun2026safety,
  title={Safety alignment as continual learning: Mitigating the alignment tax via orthogonal gradient projection},
  author={Sun, Guanglong and Zhang, Siyuan and Wang, Liyuan and Zhu, Jun and Su, Hang and Zhong, Yi},
  journal={arXiv preprint arXiv:2602.07892},
  year={2026}
- }
+ }
+ ```
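
The OGPSA description added to the README (estimate a low-rank reference subspace from general-capability gradients, then remove from each safety gradient the component lying in that subspace) can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names `reference_subspace` and `project_out`, the use of a truncated SVD to obtain the subspace, the flat gradient vectors, and the random toy data are all assumptions made for illustration; the linked repository may estimate and apply the projection differently (for example per layer or with a different rank selection rule).

```python
import numpy as np


def reference_subspace(general_grads: np.ndarray, rank: int) -> np.ndarray:
    """Return an orthonormal basis (d x rank) for a low-rank reference subspace.

    `general_grads` has shape (n, d): one flattened gradient per
    general-capability example. Using a truncated SVD here is an
    assumption of this sketch, not a detail taken from the paper.
    """
    # Rows of `vt` are orthonormal right singular vectors, ordered by
    # decreasing singular value; the first `rank` rows span the dominant
    # directions of the general-capability gradients.
    _, _, vt = np.linalg.svd(general_grads, full_matrices=False)
    return vt[:rank].T


def project_out(safety_grad: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Remove the component of `safety_grad` that lies in span(basis)."""
    # For an orthonormal basis U, the projection onto span(U) is U U^T g.
    return safety_grad - basis @ (basis.T @ safety_grad)


# Toy data standing in for real gradients (hypothetical sizes).
rng = np.random.default_rng(0)
general_grads = rng.normal(size=(8, 32))  # 8 reference gradients, 32 parameters
safety_grad = rng.normal(size=32)         # one safety-alignment gradient

basis = reference_subspace(general_grads, rank=4)
projected = project_out(safety_grad, basis)
```

After projection, `basis.T @ projected` is zero up to floating-point error, so a step along `projected` has no first-order component in the estimated reference directions. The rank trades off how much of the general-capability gradient space is protected against how much of the safety gradient is kept.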