Migrate model card from transformers-repo
Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/huawei-noah/TinyBERT_General_4L_312D/README.md
README.md (ADDED):
TinyBERT: Distilling BERT for Natural Language Understanding
========
TinyBERT is 7.5x smaller and 9.4x faster at inference than BERT-base, and achieves competitive performance on natural language understanding tasks. It is trained with a novel Transformer distillation performed at both the pre-training and task-specific learning stages. In general distillation, we use the original BERT-base without fine-tuning as the teacher and a large-scale text corpus as the learning data. By performing Transformer distillation on text from the general domain, we obtain a general TinyBERT that provides a good initialization for task-specific distillation. Here we provide this general TinyBERT for your downstream tasks.

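A minimal usage sketch (not part of the original card): since this is the general model, intended as an initialization for task-specific distillation or fine-tuning rather than a ready-to-use classifier, the example only loads the checkpoint with the Hugging Face `transformers` Auto classes and extracts contextual embeddings. It assumes a recent `transformers` release whose model outputs expose `last_hidden_state`.

```python
from transformers import AutoModel, AutoTokenizer

# General (not task-specific) TinyBERT checkpoint: 4 Transformer layers, hidden size 312.
tokenizer = AutoTokenizer.from_pretrained("huawei-noah/TinyBERT_General_4L_312D")
model = AutoModel.from_pretrained("huawei-noah/TinyBERT_General_4L_312D")

# Encode a sentence and run a forward pass to obtain contextual embeddings.
inputs = tokenizer("TinyBERT is a compact student of BERT.", return_tensors="pt")
outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (1, sequence_length, 312)
```
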
For more details about the techniques of TinyBERT, refer to our paper:
[TinyBERT: Distilling BERT for Natural Language Understanding](https://arxiv.org/abs/1909.10351)

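As a rough, unofficial illustration of the Transformer distillation sketched above (not the authors' released training code): one common way to express the layer-wise objective is an MSE term between the teacher's and the (projected) student's hidden states plus an MSE term between their attention matrices, with a learned linear map bridging the width mismatch between student (312) and BERT-base teacher (768). The names `hidden_proj` and `layer_distillation_loss` below are hypothetical.

```python
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical learned projection from the student hidden size (312)
# to the BERT-base teacher hidden size (768).
hidden_proj = nn.Linear(312, 768, bias=False)

def layer_distillation_loss(student_hidden, teacher_hidden, student_attn, teacher_attn):
    """Loss for one (student layer, teacher layer) pair: MSE between projected
    student hidden states and teacher hidden states, plus MSE between the
    corresponding attention matrices."""
    hidden_loss = F.mse_loss(hidden_proj(student_hidden), teacher_hidden)
    attn_loss = F.mse_loss(student_attn, teacher_attn)
    return hidden_loss + attn_loss
```
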
Citation
========
If you find TinyBERT useful in your research, please cite the following paper:
```
@article{jiao2019tinybert,
  title={{TinyBERT}: Distilling {BERT} for Natural Language Understanding},
  author={Jiao, Xiaoqi and Yin, Yichun and Shang, Lifeng and Jiang, Xin and Chen, Xiao and Li, Linlin and Wang, Fang and Liu, Qun},
  journal={arXiv preprint arXiv:1909.10351},
  year={2019}
}
```