Commit d280b81
Parent(s): ff1be4a
Update README.md
README.md CHANGED

@@ -20,10 +20,10 @@ This model is the [further-pretrained version of German FinBERT](https://hugging

### Fine-tuning

-I fine-tune
-standard parameters.
-values for learning rate, batch size and number of epochs, following the suggestions of [Chalkidis et al. (2020)](https://aclanthology.org/2020.findings-emnlp.261/).
-
+I fine-tune the model using the 1cycle policy of [Smith and Topin (2019)](https://arxiv.org/abs/1708.07120). I use the Adam optimization method of [Kingma and Ba (2014)](https://arxiv.org/abs/1412.6980) with
+standard parameters. I run a grid search on the evaluation set to find the best hyper-parameter setup. I test different
+values for learning rate, batch size and number of epochs, following the suggestions of [Chalkidis et al. (2020)](https://aclanthology.org/2020.findings-emnlp.261/). I repeat the fine-tuning for each setup five times with different seeds, to avoid getting good results by chance.
+After finding the best model w.r.t. the evaluation set, I report the mean result across seeds for that model on the test set.

### Results
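For readers who want to reproduce the setup the new text describes, here is a minimal sketch of the 1cycle-plus-Adam configuration. The README does not name a framework, so this assumes PyTorch (`torch.optim.Adam`, `OneCycleLR`); the toy model, data, and hyper-parameter values are placeholders, not the ones used for the released model.

```python
# Sketch of "1cycle policy + Adam with standard parameters" as described
# in the commit. Assumes PyTorch; the linear model and random data are
# stand-ins for the actual BERT classifier and fine-tuning corpus.
import torch
from torch import nn
from torch.optim import Adam
from torch.optim.lr_scheduler import OneCycleLR
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins (hypothetical): replace with the real model and dataset.
model = nn.Linear(16, 2)
data = TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,)))
train_loader = DataLoader(data, batch_size=8, shuffle=True)

NUM_EPOCHS = 3   # placeholder; the grid search below varies this
MAX_LR = 2e-5    # placeholder; the grid search below varies this

# Adam with its standard defaults: betas=(0.9, 0.999), eps=1e-8.
optimizer = Adam(model.parameters(), lr=MAX_LR)

# 1cycle policy (Smith and Topin, 2019): the learning rate warms up to
# max_lr, then anneals back down over the course of the whole run.
scheduler = OneCycleLR(optimizer, max_lr=MAX_LR,
                       epochs=NUM_EPOCHS, steps_per_epoch=len(train_loader))

loss_fn = nn.CrossEntropyLoss()
for epoch in range(NUM_EPOCHS):
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), labels)
        loss.backward()
        optimizer.step()
        scheduler.step()  # 1cycle steps the LR once per optimizer step
```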
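And a hedged sketch of the selection protocol: a grid search scored on the evaluation set, five seeds per setup, and the winning setup's mean test score reported. The grid values and the `train_and_eval` helper are hypothetical stand-ins, not the actual values or code behind the released model.

```python
# Grid search over LR / batch size / epochs, five seeds per setup;
# pick the setup by mean evaluation score, report its mean test score.
import itertools
import random
import statistics

import torch

LEARNING_RATES = [2e-5, 3e-5, 5e-5]   # illustrative grid values
BATCH_SIZES = [16, 32]                # illustrative grid values
EPOCH_COUNTS = [3, 4]                 # illustrative grid values
SEEDS = [0, 1, 2, 3, 4]               # five seeds per setup

def train_and_eval(lr, batch_size, epochs, seed):
    """Hypothetical helper: fine-tune once with this setup and seed and
    return (eval_score, test_score). A real version would run the
    training loop from the previous sketch; here it returns dummies."""
    random.seed(seed)
    torch.manual_seed(seed)
    eval_score = random.random()   # stand-in for the evaluation metric
    test_score = random.random()   # stand-in for the test metric
    return eval_score, test_score

results = {}
for lr, bs, ep in itertools.product(LEARNING_RATES, BATCH_SIZES, EPOCH_COUNTS):
    scores = [train_and_eval(lr, bs, ep, seed) for seed in SEEDS]
    mean_eval = statistics.mean(e for e, _ in scores)
    mean_test = statistics.mean(t for _, t in scores)
    results[(lr, bs, ep)] = (mean_eval, mean_test)

# Select the setup with the best mean evaluation score, then report the
# mean test score across the five seeds for that setup.
best_setup = max(results, key=lambda setup: results[setup][0])
print(best_setup, "mean test score:", results[best_setup][1])
```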