long2333/OGPSA

Improve model card and add metadata

by nielsr HF Staff - opened 8 days ago

Files changed (1)

README.md CHANGED

@@ -1,14 +1,27 @@
 ---
 license: apache-2.0
 ---
-This model is the official implementation of the paper: https://arxiv.org/abs/2602.07892.
 ## Citation
 If you find this model or dataset useful in your research, please cite our paper:
 @article{sun2026safety,
   title={Safety alignment as continual learning: Mitigating the alignment tax via orthogonal gradient projection},
   author={Sun, Guanglong and Zhang, Siyuan and Wang, Liyuan and Zhu, Jun and Su, Hang and Zhong, Yi},
   journal={arXiv preprint arXiv:2602.07892},
   year={2026}
-}

 ---
 license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
+# Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection
+This model is the official implementation of the paper [Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection](https://arxiv.org/abs/2602.07892).
+**OGPSA** (**O**rthogonal **G**radient **P**rojection for **S**afety **A**lignment) is a method that preserves general capabilities during safety alignment via an orthogonal gradient projection strategy, balancing safety with general utility. It estimates a low-rank reference subspace from gradients on a small set of general-capability data and removes from each safety gradient the component lying in this subspace.
+## Resources
+- **Paper:** [https://arxiv.org/abs/2602.07892](https://arxiv.org/abs/2602.07892)
+- **Code:** [https://github.com/SunGL001/OGPSA](https://github.com/SunGL001/OGPSA)
 ## Citation
 If you find this model or dataset useful in your research, please cite our paper:
+```bibtex
 @article{sun2026safety,
   title={Safety alignment as continual learning: Mitigating the alignment tax via orthogonal gradient projection},
   author={Sun, Guanglong and Zhang, Siyuan and Wang, Liyuan and Zhu, Jun and Su, Hang and Zhong, Yi},
   journal={arXiv preprint arXiv:2602.07892},
   year={2026}
+}
+```
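
The new model card text describes OGPSA as estimating a low-rank reference subspace from general-capability gradients and removing from each safety gradient the component lying in that subspace. A minimal NumPy sketch of that projection step may help readers of the card. It is an illustration only and is not taken from the OGPSA repository: the function names `reference_basis` and `project_out` are hypothetical, gradients are assumed to be flattened into vectors, and using a truncated SVD to estimate the subspace is an assumption rather than something the card states.

```python
import numpy as np


def reference_basis(general_grads: np.ndarray, rank: int) -> np.ndarray:
    """Orthonormal basis (d x rank) for a low-rank reference subspace.

    general_grads has shape (n, d): one flattened gradient per row, computed
    on general-capability data. Using the top right singular vectors is an
    assumption made for this sketch.
    """
    if not 0 < rank <= min(general_grads.shape):
        raise ValueError("rank must be between 1 and min(n, d)")
    # Rows of vt are orthonormal directions in gradient space, ordered by
    # singular value, so the first `rank` rows span the dominant subspace.
    _, _, vt = np.linalg.svd(general_grads, full_matrices=False)
    return vt[:rank].T


def project_out(safety_grad: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Remove the component of safety_grad that lies in span(basis)."""
    # basis has orthonormal columns, so basis @ basis.T is the projector
    # onto the reference subspace; subtracting it leaves the orthogonal part.
    return safety_grad - basis @ (basis.T @ safety_grad)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    general_grads = rng.normal(size=(8, 32))   # toy stand-in for real gradients
    basis = reference_basis(general_grads, rank=4)
    safety_grad = rng.normal(size=32)
    projected = project_out(safety_grad, basis)
    # The projected gradient has (numerically) no component in the subspace.
    print(np.abs(basis.T @ projected).max())
```

In a training loop, the projected gradient would take the place of the raw safety gradient in the optimizer step, so that the update does not move the parameters along the reference directions. How the subspace is built and refreshed in the actual method is documented in the paper and code linked under Resources.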