Update README.md
README.md CHANGED
@@ -50,8 +50,11 @@ widget:
 
 ## Model description
 
-
+This model is a distilled version of [projecte-aina/roberta-base-ca-v2](https://huggingface.co/projecte-aina/roberta-base-ca-v2). It follows the same training procedure as [DistilBERT](https://arxiv.org/abs/1910.01108). The code for the distillation process can be found [HERE_TODO](https://github.com/TeMU-BSC/distillation).
 
+The model has 6 layers, 768 hidden dimensions, and 12 attention heads, totaling 82M parameters (compared to 125M for RoBERTa-base). On average, it is twice as fast as its teacher.
+
+We encourage users of this model to check out the [projecte-aina/roberta-base-ca-v2](https://huggingface.co/projecte-aina/roberta-base-ca-v2) model card to learn more about the training and evaluation data.
 
 ## Intended uses and limitations
 
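As a sanity check, the 82M figure quoted in the diff can be roughly reproduced from the standard RoBERTa parameter formula. This is only a sketch: the layer count, hidden size, and head count come from the model card, while the vocabulary size (50,262, as in roberta-base-ca-v2) and the maximum position count (514, as in RoBERTa) are assumptions, and the pooler head is ignored.

```python
# Rough sanity check of the ~82M parameter figure quoted above.
# Assumed values (not stated in the diff): vocab size 50,262 and
# 514 position embeddings, as in RoBERTa-style models.

def roberta_param_count(layers=6, hidden=768, ffn=4 * 768,
                        vocab=50_262, max_pos=514):
    # Embeddings: token + position + token-type, plus one LayerNorm.
    embeddings = (vocab + max_pos + 1) * hidden + 2 * hidden

    # Per encoder layer: Q/K/V/output projections (weights + biases),
    # the two feed-forward matrices, and two LayerNorms.
    attention = 4 * (hidden * hidden + hidden)
    feed_forward = hidden * ffn + ffn + ffn * hidden + hidden
    layer_norms = 2 * 2 * hidden
    per_layer = attention + feed_forward + layer_norms

    return embeddings + layers * per_layer

total = roberta_param_count()
print(f"{total / 1e6:.1f}M parameters")  # close to the quoted 82M
```

With 12 layers instead of 6, the same formula lands near the 125M quoted for the RoBERTa-base teacher, which supports the "6 layers, 768 dimensions, 12 heads" configuration.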