PeymanHosseini/Hummingbird

Text Generation

PeymanHosseini committed on Jun 22, 2024

Commit 2699f75 · verified · 1 Parent(s): 115e671

Update README.md

Updates README (Fixes Grammar Errors)

Files changed (1)

README.md +3 -3

@@ -16,14 +16,14 @@ This version of Hummingbird is only meant to demonstrate Efficient Attention for
 ## Model Details
-The models consists of 1.1 Billion parameters with the following specifications:
 | Parameter            | size |
-| -------------------- | ---- |
 | # Transformer Blocks | 10   |
 | Model Dimension      | 3072 |
 | # Heads              | 1    |
-The Attention Mechanism used is based on our newly proposed Efficient Attention from our paper, *You Need to Pay Better Attention: Rethinking the Mathematics of Attention Mechanism* ([arXiv:2403.01643](https://arxiv.org/abs/2403.01643)). We have chosen the number of heeads to be 1 as an interesting case study, since all current LMs use multiple heads.

 ## Model Details
+The model consists of 1.1 Billion parameters with the following specifications:
 | Parameter            | size |
+| :------------------- | :--- |
 | # Transformer Blocks | 10   |
 | Model Dimension      | 3072 |
 | # Heads              | 1    |
+The Attention Mechanism used is based on our newly proposed Efficient Attention from our paper, *You Need to Pay Better Attention: Rethinking the Mathematics of Attention Mechanism* ([arXiv:2403.01643](https://arxiv.org/abs/2403.01643)). We have chosen the number of heads to be 1 as an interesting case study since all current LMs use multiple heads.
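
To make the specification table in the updated README concrete, the sketch below records the three listed values and derives two quantities from them. It is only a rough illustration under stated assumptions: the names `HummingbirdSpec` and `textbook_block_params` are hypothetical, the per-block count assumes a textbook Transformer block (four `d_model x d_model` attention projections and a feed-forward layer with a 4x expansion, biases and embeddings ignored), and it does not model the Efficient Attention mechanism from the paper, whose parameter count may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HummingbirdSpec:
    """Values copied from the README table; the class itself is hypothetical."""
    n_blocks: int = 10    # Transformer Blocks
    d_model: int = 3072   # Model Dimension
    n_heads: int = 1      # Heads

    @property
    def head_dim(self) -> int:
        # Usual convention: the model dimension is split evenly across heads.
        return self.d_model // self.n_heads


def textbook_block_params(d_model: int, ffn_mult: int = 4) -> int:
    """Weights in one standard Transformer block (an assumption, not the
    Efficient Attention block used by the model)."""
    attention = 4 * d_model * d_model                # W_Q, W_K, W_V, W_O
    feed_forward = 2 * ffn_mult * d_model * d_model  # up and down projections
    return attention + feed_forward


spec = HummingbirdSpec()
estimate = spec.n_blocks * textbook_block_params(spec.d_model)

print(f"head dimension: {spec.head_dim}")
print(f"textbook estimate across blocks: {estimate:,}")
```

With a single head, the head dimension equals the full model dimension. The textbook estimate is of the same order as the 1.1 Billion parameters stated in the README, but it should not be read as the model's actual parameter breakdown.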