This model is a fine-tuned version of sbartlett97/gqa-opus-mt-de-en on an unknown dataset. It achieves the following results on the evaluation set:
More information needed
The training hyperparameters are not listed. Training results after each epoch:
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 1.5403 | 1.0 | 62500 | 1.9670 | 0.2318 | 12.8925 |
| 1.5017 | 2.0 | 125000 | 1.9568 | 0.2285 | 12.8835 |
| 1.4314 | 3.0 | 187500 | 1.9667 | 0.2227 | 12.863 |
| 1.4056 | 4.0 | 250000 | 1.9781 | 0.2306 | 12.9815 |
| 1.3433 | 5.0 | 312500 | 2.0043 | 0.2312 | 12.937 |
| 1.3228 | 6.0 | 375000 | 2.0210 | 0.2283 | 12.9465 |
| 1.2224 | 7.0 | 437500 | 2.0480 | 0.2284 | 12.9285 |
| 1.2103 | 8.0 | 500000 | 2.0813 | 0.2305 | 12.901 |
| 1.1647 | 9.0 | 562500 | 2.1105 | 0.2288 | 12.93 |
| 1.1148 | 10.0 | 625000 | 2.1348 | 0.2289 | 12.989 |
| 1.1014 | 11.0 | 687500 | 2.1448 | 0.2282 | 12.924 |
| 1.0690 | 12.0 | 750000 | 2.1797 | 0.2258 | 12.9675 |
| 0.9994 | 13.0 | 812500 | 2.2147 | 0.2272 | 12.996 |
| 0.9748 | 14.0 | 875000 | 2.2385 | 0.2251 | 12.9535 |
| 0.9504 | 15.0 | 937500 | 2.2863 | 0.224 | 12.975 |
| 0.9147 | 16.0 | 1000000 | 2.3162 | 0.2241 | 12.9705 |
| 0.8565 | 17.0 | 1062500 | 2.3561 | 0.2272 | 13.012 |
| 0.8204 | 18.0 | 1125000 | 2.3846 | 0.2273 | 13.0055 |
| 0.7785 | 19.0 | 1187500 | 2.4334 | 0.2217 | 12.991 |
| 0.7603 | 20.0 | 1250000 | 2.4639 | 0.2237 | 13.0475 |
| 0.7153 | 21.0 | 1312500 | 2.5014 | 0.2213 | 13.051 |
| 0.6761 | 22.0 | 1375000 | 2.5300 | 0.2216 | 13.0385 |
| 0.6352 | 23.0 | 1437500 | 2.5624 | 0.2219 | 13.078 |
| 0.6079 | 24.0 | 1500000 | 2.5957 | 0.2245 | 13.087 |
| 0.5723 | 25.0 | 1562500 | 2.6134 | 0.2252 | 13.0925 |
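
In the table above, training loss falls every epoch while validation loss rises from epoch 3 onward and BLEU stays between roughly 0.22 and 0.23, which is the usual signature of overfitting. The following is a minimal sketch of how one might pick a checkpoint from these numbers. The rows are copied from the table; the function names are illustrative and are not part of the original training setup.

```python
# Rows copied from the results table:
# (epoch, training loss, validation loss, BLEU)
ROWS = [
    (1, 1.5403, 1.9670, 0.2318), (2, 1.5017, 1.9568, 0.2285),
    (3, 1.4314, 1.9667, 0.2227), (4, 1.4056, 1.9781, 0.2306),
    (5, 1.3433, 2.0043, 0.2312), (6, 1.3228, 2.0210, 0.2283),
    (7, 1.2224, 2.0480, 0.2284), (8, 1.2103, 2.0813, 0.2305),
    (9, 1.1647, 2.1105, 0.2288), (10, 1.1148, 2.1348, 0.2289),
    (11, 1.1014, 2.1448, 0.2282), (12, 1.0690, 2.1797, 0.2258),
    (13, 0.9994, 2.2147, 0.2272), (14, 0.9748, 2.2385, 0.2251),
    (15, 0.9504, 2.2863, 0.2240), (16, 0.9147, 2.3162, 0.2241),
    (17, 0.8565, 2.3561, 0.2272), (18, 0.8204, 2.3846, 0.2273),
    (19, 0.7785, 2.4334, 0.2217), (20, 0.7603, 2.4639, 0.2237),
    (21, 0.7153, 2.5014, 0.2213), (22, 0.6761, 2.5300, 0.2216),
    (23, 0.6352, 2.5624, 0.2219), (24, 0.6079, 2.5957, 0.2245),
    (25, 0.5723, 2.6134, 0.2252),
]


def best_by_val_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])


def best_by_bleu(rows):
    """Return the row with the highest BLEU score."""
    return max(rows, key=lambda r: r[3])


def generalization_gap(rows):
    """Return (epoch, validation loss - training loss) for each row."""
    return [(epoch, round(val - train, 4)) for epoch, train, val, _ in rows]


if __name__ == "__main__":
    print("Lowest validation loss:", best_by_val_loss(ROWS))
    print("Highest BLEU:", best_by_bleu(ROWS))
    print("Gap at first and last epoch:",
          generalization_gap(ROWS)[0], generalization_gap(ROWS)[-1])
```

By either criterion the best checkpoint falls within the first two epochs, so the later epochs appear to add training cost without improving evaluation quality.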