exceptions
Collection: Data and models for "Manipulating language models’ training data to study syntactic constraint learning: the case of English passivization" (49 items)
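This model is a fine-tuned version of a base model (name not given) on an unknown dataset. At the last logged evaluation (step 92,000), it reaches a validation loss of 3.3035 and an accuracy of 0.3944 on the evaluation set.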
More information needed
The hyperparameters used during training are not listed. Training results, evaluated every 1,000 steps:
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 5.1033 | 0.1076 | 1000 | 5.0180 | 0.2274 |
| 4.5988 | 0.2153 | 2000 | 4.5230 | 0.2689 |
| 4.3355 | 0.3229 | 3000 | 4.2498 | 0.2975 |
| 4.1754 | 0.4305 | 4000 | 4.1035 | 0.3111 |
| 4.0607 | 0.5382 | 5000 | 3.9984 | 0.3212 |
| 4.0022 | 0.6458 | 6000 | 3.9244 | 0.3273 |
| 3.9275 | 0.7534 | 7000 | 3.8683 | 0.3325 |
| 3.8874 | 0.8610 | 8000 | 3.8202 | 0.3373 |
| 3.8623 | 0.9687 | 9000 | 3.7831 | 0.3403 |
| 3.7478 | 1.0763 | 10000 | 3.7542 | 0.3438 |
| 3.7484 | 1.1839 | 11000 | 3.7274 | 0.3462 |
| 3.7310 | 1.2916 | 12000 | 3.7030 | 0.3489 |
| 3.7217 | 1.3992 | 13000 | 3.6806 | 0.3509 |
| 3.7223 | 1.5068 | 14000 | 3.6594 | 0.3529 |
| 3.6733 | 1.6145 | 15000 | 3.6423 | 0.3546 |
| 3.6659 | 1.7221 | 16000 | 3.6230 | 0.3564 |
| 3.6659 | 1.8297 | 17000 | 3.6099 | 0.3582 |
| 3.6405 | 1.9374 | 18000 | 3.5942 | 0.3595 |
| 3.5552 | 2.0450 | 19000 | 3.5863 | 0.3607 |
| 3.5517 | 2.1526 | 20000 | 3.5759 | 0.3619 |
| 3.5622 | 2.2603 | 21000 | 3.5652 | 0.3634 |
| 3.5521 | 2.3679 | 22000 | 3.5536 | 0.3646 |
| 3.5272 | 2.4755 | 23000 | 3.5438 | 0.3653 |
| 3.5408 | 2.5831 | 24000 | 3.5349 | 0.3664 |
| 3.5384 | 2.6908 | 25000 | 3.5265 | 0.3676 |
| 3.5453 | 2.7984 | 26000 | 3.5167 | 0.3680 |
| 3.5254 | 2.9060 | 27000 | 3.5096 | 0.3690 |
| 3.4289 | 3.0137 | 28000 | 3.5023 | 0.3698 |
| 3.4573 | 3.1213 | 29000 | 3.5012 | 0.3706 |
| 3.4612 | 3.2289 | 30000 | 3.4919 | 0.3712 |
| 3.4596 | 3.3366 | 31000 | 3.4854 | 0.3723 |
| 3.4761 | 3.4442 | 32000 | 3.4807 | 0.3726 |
| 3.4718 | 3.5518 | 33000 | 3.4740 | 0.3732 |
| 3.4625 | 3.6595 | 34000 | 3.4664 | 0.3741 |
| 3.4482 | 3.7671 | 35000 | 3.4618 | 0.3749 |
| 3.4544 | 3.8747 | 36000 | 3.4568 | 0.3747 |
| 3.4332 | 3.9823 | 37000 | 3.4505 | 0.3760 |
| 3.3729 | 4.0900 | 38000 | 3.4507 | 0.3762 |
| 3.3957 | 4.1976 | 39000 | 3.4467 | 0.3766 |
| 3.4135 | 4.3052 | 40000 | 3.4434 | 0.3771 |
| 3.4065 | 4.4129 | 41000 | 3.4378 | 0.3779 |
| 3.3837 | 4.5205 | 42000 | 3.4326 | 0.3780 |
| 3.3988 | 4.6281 | 43000 | 3.4265 | 0.3786 |
| 3.4003 | 4.7358 | 44000 | 3.4218 | 0.3791 |
| 3.3718 | 4.8434 | 45000 | 3.4193 | 0.3795 |
| 3.3876 | 4.9510 | 46000 | 3.4116 | 0.3805 |
| 3.3072 | 5.0587 | 47000 | 3.4152 | 0.3806 |
| 3.3312 | 5.1663 | 48000 | 3.4141 | 0.3808 |
| 3.3300 | 5.2739 | 49000 | 3.4092 | 0.3810 |
| 3.3217 | 5.3816 | 50000 | 3.4062 | 0.3815 |
| 3.3232 | 5.4892 | 51000 | 3.4009 | 0.3818 |
| 3.3414 | 5.5968 | 52000 | 3.3978 | 0.3823 |
| 3.3284 | 5.7044 | 53000 | 3.3928 | 0.3824 |
| 3.3423 | 5.8121 | 54000 | 3.3889 | 0.3831 |
| 3.3375 | 5.9197 | 55000 | 3.3830 | 0.3838 |
| 3.2457 | 6.0273 | 56000 | 3.3864 | 0.3836 |
| 3.2743 | 6.1350 | 57000 | 3.3878 | 0.3837 |
| 3.2780 | 6.2426 | 58000 | 3.3834 | 0.3845 |
| 3.2893 | 6.3502 | 59000 | 3.3806 | 0.3846 |
| 3.2858 | 6.4579 | 60000 | 3.3779 | 0.3851 |
| 3.2612 | 6.5655 | 61000 | 3.3731 | 0.3855 |
| 3.2874 | 6.6731 | 62000 | 3.3683 | 0.3860 |
| 3.2913 | 6.7808 | 63000 | 3.3637 | 0.3862 |
| 3.2874 | 6.8884 | 64000 | 3.3623 | 0.3865 |
| 3.3006 | 6.9960 | 65000 | 3.3566 | 0.3871 |
| 3.2113 | 7.1036 | 66000 | 3.3637 | 0.3868 |
| 3.2365 | 7.2113 | 67000 | 3.3613 | 0.3871 |
| 3.2464 | 7.3189 | 68000 | 3.3587 | 0.3875 |
| 3.2246 | 7.4265 | 69000 | 3.3527 | 0.3879 |
| 3.2492 | 7.5342 | 70000 | 3.3493 | 0.3885 |
| 3.2229 | 7.6418 | 71000 | 3.3436 | 0.3890 |
| 3.2475 | 7.7494 | 72000 | 3.3427 | 0.3891 |
| 3.2372 | 7.8571 | 73000 | 3.3380 | 0.3897 |
| 3.2560 | 7.9647 | 74000 | 3.3354 | 0.3900 |
| 3.1567 | 8.0723 | 75000 | 3.3419 | 0.3896 |
| 3.1763 | 8.1800 | 76000 | 3.3383 | 0.3903 |
| 3.1703 | 8.2876 | 77000 | 3.3362 | 0.3905 |
| 3.1765 | 8.3952 | 78000 | 3.3342 | 0.3905 |
| 3.1938 | 8.5029 | 79000 | 3.3285 | 0.3911 |
| 3.1670 | 8.6105 | 80000 | 3.3272 | 0.3913 |
| 3.2080 | 8.7181 | 81000 | 3.3216 | 0.3917 |
| 3.1722 | 8.8257 | 82000 | 3.3193 | 0.3921 |
| 3.1905 | 8.9334 | 83000 | 3.3167 | 0.3924 |
| 3.1372 | 9.0410 | 84000 | 3.3181 | 0.3925 |
| 3.1427 | 9.1486 | 85000 | 3.3179 | 0.3927 |
| 3.1571 | 9.2563 | 86000 | 3.3175 | 0.3929 |
| 3.1387 | 9.3639 | 87000 | 3.3142 | 0.3933 |
| 3.1296 | 9.4715 | 88000 | 3.3115 | 0.3935 |
| 3.1449 | 9.5792 | 89000 | 3.3079 | 0.3939 |
| 3.126 | 9.6868 | 90000 | 3.3078 | 0.3941 |
| 3.1172 | 9.7944 | 91000 | 3.3053 | 0.3943 |
| 3.1328 | 9.9021 | 92000 | 3.3035 | 0.3944 |
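
The validation losses above are easier to interpret as perplexities. The sketch below converts the first and last logged rows, assuming (this is not stated in the card) that the loss is a mean per-token cross-entropy in nats, which is the usual convention for language-model training. It also estimates the number of optimizer steps per epoch from the Step and Epoch columns. The helper names `perplexity` and `steps_per_epoch` are illustrative, not part of any library.

```python
import math

# (step, epoch, validation_loss, accuracy), copied from the first and last table rows
first_row = (1000, 0.1076, 5.0180, 0.2274)
last_row = (92000, 9.9021, 3.3035, 0.3944)


def perplexity(loss_nats: float) -> float:
    """Perplexity, ASSUMING the loss is mean per-token cross-entropy in nats."""
    return math.exp(loss_nats)


def steps_per_epoch(step: int, epoch: float) -> float:
    """Approximate optimizer steps per epoch implied by one logged row."""
    return step / epoch


ppl_first = perplexity(first_row[2])
ppl_last = perplexity(last_row[2])
spe = steps_per_epoch(last_row[0], last_row[1])

print(f"validation perplexity at step {first_row[0]}: {ppl_first:.1f}")
print(f"validation perplexity at step {last_row[0]}: {ppl_last:.1f}")
print(f"approximate steps per epoch: {spe:.0f}")
```

Under that assumption, validation perplexity falls from roughly 151 at step 1,000 to roughly 27 at step 92,000, and the Step and Epoch columns imply about 9,291 steps per epoch.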