Update Hypers
README.md
CHANGED
@@ -173,6 +173,18 @@ More information needed
 ### Training data
 Custom synthetic
 
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 10
+- eval_batch_size: 3
+- distributed_type: multi-GPU
+- num_devices: 2
+- optimizer: Adam 8bit
+- lr_scheduler_type: linear
+- num_epochs: 4
+
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
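
The hyperparameters added in this change can be read together with a little arithmetic. The sketch below is illustrative only and is not the author's training code. It assumes that `train_batch_size` is a per-device value, that no gradient accumulation was used, and that the linear schedule has zero warmup and decays to zero; none of these are stated in the README. The helper names and the dataset size in the usage note are hypothetical.

```python
import math

# Values copied from the hyperparameter list in the diff above.
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 10
NUM_DEVICES = 2
NUM_EPOCHS = 4


def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Examples consumed per optimizer step.

    Assumption: train_batch_size is per device and grad_accum_steps is 1,
    since the README lists no accumulation setting.
    """
    return per_device * num_devices * grad_accum_steps


def total_training_steps(num_examples: int, batch_size: int, num_epochs: int) -> int:
    """Optimizer steps over the whole run; the last partial batch of an epoch counts as a step."""
    return math.ceil(num_examples / batch_size) * num_epochs


def linear_lr(step: int, total_steps: int, base_lr: float = LEARNING_RATE, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under a linear schedule.

    Assumption: optional linear warmup, then linear decay to zero at total_steps.
    The README does not state a warmup length, so the default is zero.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

For example, with a hypothetical dataset of 1,000 examples, `effective_batch_size(TRAIN_BATCH_SIZE, NUM_DEVICES)` gives 20 examples per step, `total_training_steps(1000, 20, NUM_EPOCHS)` gives 200 steps, and `linear_lr(100, 200)` is half of the base learning rate.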