64 % of the texts (2688) were used for training, 16 % (672) for validation and 20 % (840) for testing.
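The split above can be sketched as follows; the `texts` list and the fixed seed are illustrative assumptions, not details from the original setup:

```python
import random

# Illustrative reconstruction of the 64/16/20 split described above.
# `texts` stands in for the 4200 cleaned texts.
texts = [f"text_{i}" for i in range(4200)]

random.seed(42)  # fixed seed so the split is reproducible
shuffled = random.sample(texts, k=len(texts))

n_train = int(0.64 * len(shuffled))  # 2688 texts for training
n_val = int(0.16 * len(shuffled))    # 672 texts for validation
train = shuffled[:n_train]
val = shuffled[n_train:n_train + n_val]
test = shuffled[n_train + n_val:]    # remaining 20 %, 840 texts
```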
The texts were tokenized using a WordPiece tokenizer corresponding to the model (with a vocabulary size of 31,102, without lower casing, with padding and truncation).
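As a minimal illustration of what padding and truncation to a fixed length mean here (simplified: real WordPiece tokenization first splits words into subword IDs from the 31,102-entry vocabulary; `MAX_LEN` and `PAD_ID` are illustrative values, not the actual configuration):

```python
PAD_ID = 0   # illustrative padding token ID
MAX_LEN = 8  # illustrative; the real limit is the model's maximum sequence length

def pad_or_truncate(token_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Bring a token-ID sequence to exactly max_len entries."""
    if len(token_ids) >= max_len:
        return token_ids[:max_len]  # truncation: cut off overlong sequences
    # padding: fill short sequences up to max_len with the pad token
    return token_ids + [pad_id] * (max_len - len(token_ids))
```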
The model was then fine-tuned using TensorFlow on two NVIDIA Tesla V100-SXM2-32GB GPUs on the [bwUniCluster 2.0](https://wiki.bwhpc.de/e/BwUniCluster2.0).
The learning rate was chosen after comparing three values (5e-6, 1e-5, 2e-5) to optimize accuracy on the validation set.
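The selection step can be sketched generically; `evaluate` is a hypothetical stand-in for a full TensorFlow fine-tuning run that returns accuracy on the validation set, not the authors' actual training code:

```python
# The three candidate learning rates compared in the model card.
CANDIDATE_LRS = [5e-6, 1e-5, 2e-5]

def select_learning_rate(candidates, evaluate):
    """Return the learning rate whose run achieves the best validation accuracy.

    `evaluate` maps a learning rate to the validation accuracy of a
    fine-tuning run with that rate (one run per candidate).
    """
    return max(candidates, key=evaluate)
```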
For the final model, all texts from the training and validation sets (3360 texts) were used for training.

### Training hyperparameters