### Contents:
Trained for 2 epochs with a 5e-5 learning rate and batch size 16 (final training loss hovers around 0.3, still quite high); training took about 1h 30min on a Colab T4 (3h session limit).
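Since the run was on a Colab T4, the hyperparameters above would map onto a Hugging Face `TrainingArguments` setup roughly like this (a minimal sketch, assuming the `transformers` `Trainer` was used; the `output_dir` and `fp16` choice are assumptions, only the epochs, learning rate, and batch size come from the notes above):

```python
from transformers import TrainingArguments

# Sketch of the reported configuration; the actual training script is not shown.
training_args = TrainingArguments(
    output_dir="out",               # hypothetical output path
    num_train_epochs=2,             # 2 epochs, as reported
    learning_rate=5e-5,             # reported learning rate
    per_device_train_batch_size=16, # reported batch size
    fp16=True,                      # common on a T4 GPU; an assumption here
)
```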