Update README.md

Adds citation information (a BibTeX entry) for the Efficient Attention paper.

README.md

	@@ -27,3 +27,14 @@ The model consists of 1.1 Billion parameters with the following specifications:
 The Attention Mechanism used is based on our newly proposed Efficient Attention from our paper, *You Need to Pay Better Attention: Rethinking the Mathematics of Attention Mechanism* ([arXiv:2403.01643](https://arxiv.org/abs/2403.01643)). We have chosen the number of heads to be 1 as an interesting case study since all current LMs use multiple heads.
+If you use Efficient Attention or Hummingbird, please cite our paper:
+```
+@article{Hosseinis24BetterAttention,
+  title      = {You Need to Pay Better Attention: Rethinking the Mathematics of Attention Mechanism},
+  author     = {Hosseini, Mehran and Hosseini, Peyman},
+  journal    = {arXiv preprint arXiv:2403.01643},
+  year       = {2024}
+}
+```