Update README.md

---
language:
- en
---

# EconoBert

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on this dataset: https://huggingface.co/datasets/samchain/BIS_Speeches_97_23

It achieves the following results on the test set:

- Accuracy on the MLM task: 73%
- Accuracy on the NSP task: 95%
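
The two accuracies above can be reproduced from raw predictions with two small helpers. This is a minimal sketch, not the evaluation code used for this model; the `-100` ignore index marking unmasked positions is an assumption borrowed from the common Hugging Face convention.

```python
def mlm_accuracy(predictions, labels, ignore_index=-100):
    """Accuracy over masked positions only: positions whose label equals
    `ignore_index` were not masked and do not count toward the score."""
    scored = [(p, t) for p, t in zip(predictions, labels) if t != ignore_index]
    return sum(1 for p, t in scored if p == t) / len(scored)


def nsp_accuracy(predictions, labels):
    """Plain accuracy for the binary next-sentence-prediction head."""
    return sum(1 for p, t in zip(predictions, labels) if p == t) / len(labels)
```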

## Model description

The model is a straightforward fine-tuning of a base BERT on a dataset specific to the domain of economics. It keeps the same architecture and vocabulary as the original, so no call to resize_token_embeddings was required.

## Intended uses & limitations

This model should be used as a backbone for NLP tasks in the domains of economics, politics, and finance.

## Training and evaluation data

The dataset used for fine-tuning is: https://huggingface.co/datasets/samchain/BIS_Speeches_97_23

The dataset is made of 773k pairs of sentences, half of them negative pairs (sequence B does not follow sequence A) and the other half positive pairs (sequence B follows sequence A).

The test set is made of 136k pairs.
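
The pair construction described above can be sketched as follows. This is an illustration only: the helper name and the label convention (`1` = "B follows A") are mine, not taken from the dataset card.

```python
import random


def make_nsp_pairs(sentences, seed=0):
    """Build balanced NSP pairs from an ordered list of sentences:
    one positive pair (B really follows A) and one negative pair
    (B drawn at random from elsewhere) per position."""
    rng = random.Random(seed)
    pairs = []
    for i in range(len(sentences) - 1):
        pairs.append((sentences[i], sentences[i + 1], 1))  # positive pair
        j = rng.randrange(len(sentences))
        while j == i + 1:  # re-draw until B is not the true successor
            j = rng.randrange(len(sentences))
        pairs.append((sentences[i], sentences[j], 0))  # negative pair
    return pairs
```

By construction this yields the 50/50 split of positive and negative pairs described above.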

## Training procedure

The model was fine-tuned for 2 epochs, with a batch size of 64 and a sequence length of 128, using the Adam optimizer with a learning rate of 1e-5.
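
As a quick sanity check, these settings imply roughly 12k optimizer steps per epoch; the exact figure below assumes the incomplete final batch is dropped, which is an assumption on my side.

```python
train_pairs = 773_000  # training pairs reported above
batch_size = 64
epochs = 2

steps_per_epoch = train_pairs // batch_size  # drop the incomplete final batch
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 12078 24156
```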

### Training hyperparameters

The following hyperparameters were used during training:

### Training results

The final loss is 1.6046 on the training set and 1.47 on the test set.

### Framework versions