Add pipeline tag, library name, links to project page and GitHub URL
This PR improves the model card by:
- adding a relevant pipeline tag, ensuring people can find the model at https://huggingface.co/models?pipeline_tag=text-generation, as well as the `library_name` (see the usage sketch after this list).
- adding a link to the GitHub repository, enabling people to find the code more easily.
- adding a link to the project page.
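
Adding `library_name: transformers` together with `pipeline_tag: text-generation` also lets the Hub render a standard loading snippet on the model page. The sketch below shows what such usage could look like; the repository id `Haiyang-W/TokenFormer-450M` and the `trust_remote_code=True` flag are assumptions not confirmed by this PR, since TokenFormer is a custom architecture and may not load through the stock auto classes.

```python
# Minimal usage sketch, not part of this PR. Assumes the checkpoint is published
# under the (hypothetical) repo id below and is loadable via transformers'
# auto classes; a custom architecture like TokenFormer may need trust_remote_code.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Haiyang-W/TokenFormer-450M"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

inputs = tokenizer("TokenFormer is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```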
README.md
CHANGED
@@ -1,5 +1,7 @@
 ---
 license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
 
 The *TokenFormer* is a **fully attention-based architecture**
@@ -10,6 +12,10 @@ It contains four models of sizes
 All 4 model sizes are trained on the exact
 same data, in the exact same order.
 
+Code: https://github.com/Haiyang-W/TokenFormer
+
+Project page: https://haiyang-w.github.io/tokenformer.github.io/
+
 # TokenFormer-450M
 
 ## Model Details
@@ -92,4 +98,4 @@ TokenFormer compared with Opensource Transformer-based LLMs.
 | Pythia | 2.8B | 64.7 | 59.3 | 74.0 | 64.1 | 32.9 | 59.7 | 59.1 |
 | **TokenFormer** | 1.5B | **64.7** | 60.0 | **74.8** | **64.8** | 32.0 | 59.7 | **59.3** |
 <figcaption>Zero-shot evaluation of Language Modeling. </figcaption>
-</figure>
+</figure>
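
For reference, the discoverability that the web filter above provides can also be exercised programmatically. This is a hedged sketch assuming a recent `huggingface_hub` release in which `list_models` accepts the `pipeline_tag` and `library` filters; the query values are illustrative only.

```python
# Hedged sketch of programmatic discovery enabled by the new metadata.
# Assumes a recent huggingface_hub version exposing these filter keywords.
from huggingface_hub import HfApi

api = HfApi()
for model in api.list_models(
    search="TokenFormer",
    pipeline_tag="text-generation",
    library="transformers",
    limit=10,
):
    print(model.id)
```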