Question Answering · Transformers · PyTorch · German · bert
scherrmann committed · Commit d280b81 · 1 Parent(s): ff1be4a

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -20,10 +20,10 @@ This model is the [further-pretrained version of German FinBERT](https://hugging
 
 ### Fine-tuning
 
-I fine-tune all models on all downstream tasks using the 1cycle policy of [Smith and Topin (2019)](https://arxiv.org/abs/1708.07120). I use the Adam optimization method of [Kingma and Ba (2014)](https://arxiv.org/abs/1412.6980) with
-standard parameters. For every model, I run a separate grid search on the respective evaluation set for each task to find the best hyper-parameter setup. I test different
-values for learning rate, batch size and number of epochs, following the suggestions of [Chalkidis et al. (2020)](https://aclanthology.org/2020.findings-emnlp.261/). After that, I report the results for all models on the respective
-test set, using the tuned hyper-parameters.
+I fine-tune the model using the 1cycle policy of [Smith and Topin (2019)](https://arxiv.org/abs/1708.07120). I use the Adam optimization method of [Kingma and Ba (2014)](https://arxiv.org/abs/1412.6980) with
+standard parameters. I run a grid search on the evaluation set to find the best hyper-parameter setup. I test different
+values for learning rate, batch size and number of epochs, following the suggestions of [Chalkidis et al. (2020)](https://aclanthology.org/2020.findings-emnlp.261/). I repeat the fine-tuning for each setup five times with different seeds, to avoid getting good results by chance.
+After finding the best model w.r.t. the evaluation set, I report the mean result across seeds for that model on the test set.
 
 ### Results
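For illustration, here is a minimal PyTorch sketch of the recipe the updated paragraph describes: Adam with standard parameters, a 1cycle learning-rate schedule, a grid search over learning rate, batch size and number of epochs, and five seeds per setup with the mean evaluation score used for model selection. The checkpoint name, the grid values, and the `run_one`/`grid_search` helpers are assumptions made for the sketch, not the author's actual training code.

```python
# Hypothetical sketch of the fine-tuning procedure described above.
# Checkpoint, grid values, helper names, and data loaders are assumptions.
from itertools import product
from statistics import mean

from torch.optim import Adam
from torch.optim.lr_scheduler import OneCycleLR
from transformers import AutoModelForQuestionAnswering, set_seed

MODEL_NAME = "bert-base-german-cased"  # placeholder, not the actual checkpoint
GRID = {"lr": [2e-5, 3e-5, 5e-5], "batch_size": [16, 32], "epochs": [2, 3, 4]}
SEEDS = [0, 1, 2, 3, 4]  # five repetitions per setup, as the README states


def run_one(lr, epochs, seed, train_loader, eval_fn, test_fn):
    """Fine-tune once with a fixed seed; return (eval score, test score).

    Batches are assumed to hold tokenized inputs plus start/end positions,
    so the model returns a loss directly.
    """
    set_seed(seed)
    model = AutoModelForQuestionAnswering.from_pretrained(MODEL_NAME)
    optimizer = Adam(model.parameters(), lr=lr)  # standard Adam parameters
    scheduler = OneCycleLR(  # 1cycle policy of Smith and Topin (2019)
        optimizer, max_lr=lr, total_steps=epochs * len(train_loader)
    )
    model.train()
    for _ in range(epochs):
        for batch in train_loader:
            loss = model(**batch).loss
            loss.backward()
            optimizer.step()
            scheduler.step()
            optimizer.zero_grad()
    return eval_fn(model), test_fn(model)


def grid_search(train_loaders, eval_fn, test_fn):
    """Average eval and test scores over seeds for each setup, select the
    setup with the best mean eval score, and report its mean test score
    (the test set is never used for selection)."""
    best = None
    for lr, bs, ep in product(GRID["lr"], GRID["batch_size"], GRID["epochs"]):
        runs = [run_one(lr, ep, s, train_loaders[bs], eval_fn, test_fn)
                for s in SEEDS]
        mean_eval = mean(r[0] for r in runs)
        mean_test = mean(r[1] for r in runs)
        if best is None or mean_eval > best[0]:
            best = (mean_eval, mean_test, (lr, bs, ep))
    return best  # (best eval mean, reported test mean, chosen setup)
```

Averaging over seeds before selecting, rather than picking the single best run, is what guards against the "good results by chance" the commit message alludes to.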