train_wic_1745950295
This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.3 on the wic dataset. It achieves the following results on the evaluation set:
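- Loss: 3.1765 (the lowest validation loss in the training results below, reached at step 18800; the validation loss at the final step, 40000, is 3.2125)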
- Num Input Tokens Seen: 12845616
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 123
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- training_steps: 40000
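The relationship between these values can be sketched in a few lines of Python. This is a minimal illustration, not the training code: it assumes a single device (so `total_train_batch_size` is just `train_batch_size` times `gradient_accumulation_steps`) and zero warmup steps, since the list above mentions no warmup. The names `NUM_DEVICES`, `effective_batch_size`, and `cosine_lr` are hypothetical helpers introduced here.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 2
TRAINING_STEPS = 40000

# Assumption: one device, so there is no data-parallel multiplier.
NUM_DEVICES = 1


def effective_batch_size() -> int:
    """Number of examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def cosine_lr(step: int, warmup_steps: int = 0) -> float:
    """Half-cosine decay from LEARNING_RATE down to 0 over TRAINING_STEPS.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        # Linear warmup, only used if a caller passes warmup_steps > 0.
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, TRAINING_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 10000, 20000, 30000, 40000):
        print(step, f"{cosine_lr(step):.3e}")
```

Under these assumptions the learning rate is half its initial value at step 20000 and approaches zero near step 40000, so updates late in the run are very small.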
Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 3.9958 | 0.1637 | 200 | 3.3872 | 64080 |
| 3.5157 | 0.3275 | 400 | 3.2954 | 128048 |
| 4.3392 | 0.4912 | 600 | 3.2764 | 192224 |
| 3.7731 | 0.6549 | 800 | 3.2530 | 256832 |
| 4.8833 | 0.8187 | 1000 | 3.2281 | 321264 |
| 4.3418 | 0.9824 | 1200 | 3.2445 | 385728 |
| 2.8185 | 1.1457 | 1400 | 3.2374 | 449768 |
| 4.5054 | 1.3095 | 1600 | 3.1879 | 514072 |
| 3.5401 | 1.4732 | 1800 | 3.2108 | 578408 |
| 3.8598 | 1.6369 | 2000 | 3.2438 | 642248 |
| 4.6184 | 1.8007 | 2200 | 3.2463 | 706488 |
| 1.636 | 1.9644 | 2400 | 3.2159 | 770888 |
| 4.2425 | 2.1277 | 2600 | 3.2043 | 835216 |
| 4.4237 | 2.2914 | 2800 | 3.2014 | 899312 |
| 2.0277 | 2.4552 | 3000 | 3.1808 | 963696 |
| 2.9487 | 2.6189 | 3200 | 3.2460 | 1027904 |
| 2.511 | 2.7826 | 3400 | 3.2238 | 1092016 |
| 4.6544 | 2.9464 | 3600 | 3.2033 | 1156240 |
| 2.285 | 3.1097 | 3800 | 3.1921 | 1220568 |
| 2.4531 | 3.2734 | 4000 | 3.2192 | 1285128 |
| 3.4836 | 3.4372 | 4200 | 3.2053 | 1349032 |
| 3.4662 | 3.6009 | 4400 | 3.1977 | 1413096 |
| 3.1351 | 3.7646 | 4600 | 3.2280 | 1477816 |
| 3.7616 | 3.9284 | 4800 | 3.2018 | 1541800 |
| 3.2064 | 4.0917 | 5000 | 3.2130 | 1605480 |
| 3.6003 | 4.2554 | 5200 | 3.2141 | 1669464 |
| 3.1363 | 4.4192 | 5400 | 3.2227 | 1733528 |
| 3.4699 | 4.5829 | 5600 | 3.2105 | 1797608 |
| 2.9962 | 4.7466 | 5800 | 3.2237 | 1862328 |
| 3.6113 | 4.9104 | 6000 | 3.1934 | 1926824 |
| 4.0669 | 5.0737 | 6200 | 3.2136 | 1990752 |
| 3.7169 | 5.2374 | 6400 | 3.2272 | 2055200 |
| 3.803 | 5.4011 | 6600 | 3.2170 | 2119232 |
| 3.9642 | 5.5649 | 6800 | 3.2167 | 2183440 |
| 3.0489 | 5.7286 | 7000 | 3.2061 | 2247920 |
| 3.5582 | 5.8923 | 7200 | 3.2074 | 2312032 |
| 3.1043 | 6.0557 | 7400 | 3.1989 | 2376200 |
| 3.0691 | 6.2194 | 7600 | 3.2181 | 2440472 |
| 2.4881 | 6.3831 | 7800 | 3.2191 | 2504760 |
| 5.8401 | 6.5469 | 8000 | 3.1854 | 2568840 |
| 3.3306 | 6.7106 | 8200 | 3.1947 | 2632776 |
| 2.7831 | 6.8743 | 8400 | 3.2077 | 2697176 |
| 4.6388 | 7.0377 | 8600 | 3.2230 | 2761240 |
| 4.732 | 7.2014 | 8800 | 3.2151 | 2825240 |
| 4.6202 | 7.3651 | 9000 | 3.2434 | 2889368 |
| 3.0046 | 7.5289 | 9200 | 3.1974 | 2953752 |
| 3.1253 | 7.6926 | 9400 | 3.1951 | 3018440 |
| 2.4775 | 7.8563 | 9600 | 3.1978 | 3082552 |
| 1.7311 | 8.0196 | 9800 | 3.2191 | 3146472 |
| 3.4284 | 8.1834 | 10000 | 3.2100 | 3211320 |
| 2.1299 | 8.3471 | 10200 | 3.2070 | 3275192 |
| 2.3182 | 8.5108 | 10400 | 3.1918 | 3339400 |
| 3.3599 | 8.6746 | 10600 | 3.1857 | 3403656 |
| 3.7202 | 8.8383 | 10800 | 3.1957 | 3467848 |
| 4.6498 | 9.0016 | 11000 | 3.2008 | 3531952 |
| 5.2661 | 9.1654 | 11200 | 3.2382 | 3596368 |
| 3.6412 | 9.3291 | 11400 | 3.2045 | 3660496 |
| 1.9489 | 9.4928 | 11600 | 3.2084 | 3724480 |
| 4.1304 | 9.6566 | 11800 | 3.2145 | 3788928 |
| 1.9428 | 9.8203 | 12000 | 3.1894 | 3853296 |
| 2.7573 | 9.9840 | 12200 | 3.2168 | 3917232 |
| 2.7708 | 10.1474 | 12400 | 3.2046 | 3981568 |
| 2.951 | 10.3111 | 12600 | 3.2150 | 4045600 |
| 4.7755 | 10.4748 | 12800 | 3.2154 | 4110048 |
| 3.9557 | 10.6386 | 13000 | 3.1837 | 4174432 |
| 2.7547 | 10.8023 | 13200 | 3.2243 | 4238512 |
| 2.7812 | 10.9660 | 13400 | 3.2060 | 4302800 |
| 3.2587 | 11.1293 | 13600 | 3.1930 | 4366728 |
| 4.2143 | 11.2931 | 13800 | 3.1916 | 4431112 |
| 2.9836 | 11.4568 | 14000 | 3.2197 | 4495320 |
| 2.6835 | 11.6205 | 14200 | 3.2014 | 4559336 |
| 2.1858 | 11.7843 | 14400 | 3.2053 | 4623464 |
| 4.9649 | 11.9480 | 14600 | 3.1945 | 4687880 |
| 2.8991 | 12.1113 | 14800 | 3.1890 | 4752088 |
| 5.8153 | 12.2751 | 15000 | 3.2063 | 4816376 |
| 1.9165 | 12.4388 | 15200 | 3.2002 | 4881000 |
| 2.2727 | 12.6025 | 15400 | 3.2027 | 4944776 |
| 3.1679 | 12.7663 | 15600 | 3.2219 | 5009528 |
| 3.0102 | 12.9300 | 15800 | 3.2150 | 5073448 |
| 4.699 | 13.0933 | 16000 | 3.2290 | 5137696 |
| 2.5314 | 13.2571 | 16200 | 3.2048 | 5202256 |
| 4.6496 | 13.4208 | 16400 | 3.1821 | 5266128 |
| 4.0822 | 13.5845 | 16600 | 3.1788 | 5330256 |
| 4.3593 | 13.7483 | 16800 | 3.2099 | 5395072 |
| 3.9051 | 13.9120 | 17000 | 3.1960 | 5458672 |
| 3.9994 | 14.0753 | 17200 | 3.2103 | 5522480 |
| 2.8361 | 14.2391 | 17400 | 3.2233 | 5586480 |
| 4.9401 | 14.4028 | 17600 | 3.1868 | 5650208 |
| 3.8849 | 14.5665 | 17800 | 3.1857 | 5714704 |
| 3.5166 | 14.7302 | 18000 | 3.2083 | 5779488 |
| 3.9967 | 14.8940 | 18200 | 3.2256 | 5843728 |
| 1.8287 | 15.0573 | 18400 | 3.2158 | 5908152 |
| 2.9093 | 15.2210 | 18600 | 3.2090 | 5972168 |
| 2.1674 | 15.3848 | 18800 | 3.1765 | 6037144 |
| 2.9511 | 15.5485 | 19000 | 3.2208 | 6101800 |
| 4.2766 | 15.7122 | 19200 | 3.2050 | 6165416 |
| 4.3034 | 15.8760 | 19400 | 3.1860 | 6229672 |
| 4.6391 | 16.0393 | 19600 | 3.2082 | 6293504 |
| 1.9051 | 16.2030 | 19800 | 3.2015 | 6357840 |
| 2.2928 | 16.3668 | 20000 | 3.2217 | 6422352 |
| 1.9085 | 16.5305 | 20200 | 3.2065 | 6486352 |
| 3.924 | 16.6942 | 20400 | 3.2151 | 6550928 |
| 3.4126 | 16.8580 | 20600 | 3.2327 | 6615008 |
| 4.5493 | 17.0213 | 20800 | 3.2317 | 6678864 |
| 3.8615 | 17.1850 | 21000 | 3.2040 | 6743040 |
| 4.0426 | 17.3488 | 21200 | 3.2315 | 6807664 |
| 2.6831 | 17.5125 | 21400 | 3.2030 | 6871648 |
| 2.9024 | 17.6762 | 21600 | 3.2017 | 6936048 |
| 3.0366 | 17.8400 | 21800 | 3.1934 | 7000448 |
| 3.8192 | 18.0033 | 22000 | 3.1915 | 7064224 |
| 5.5216 | 18.1670 | 22200 | 3.2047 | 7128848 |
| 3.4714 | 18.3307 | 22400 | 3.1915 | 7192992 |
| 4.8674 | 18.4945 | 22600 | 3.1988 | 7256624 |
| 3.7189 | 18.6582 | 22800 | 3.1974 | 7321520 |
| 2.5776 | 18.8219 | 23000 | 3.2019 | 7385552 |
| 3.5356 | 18.9857 | 23200 | 3.2137 | 7449600 |
| 3.0237 | 19.1490 | 23400 | 3.2293 | 7513504 |
| 3.8673 | 19.3127 | 23600 | 3.1965 | 7577776 |
| 4.9547 | 19.4765 | 23800 | 3.2123 | 7642048 |
| 5.2959 | 19.6402 | 24000 | 3.1930 | 7706720 |
| 5.6198 | 19.8039 | 24200 | 3.2101 | 7770896 |
| 3.0233 | 19.9677 | 24400 | 3.2439 | 7835136 |
| 4.588 | 20.1310 | 24600 | 3.1871 | 7899176 |
| 2.266 | 20.2947 | 24800 | 3.2068 | 7963800 |
| 2.8501 | 20.4585 | 25000 | 3.2414 | 8028584 |
| 3.9682 | 20.6222 | 25200 | 3.2091 | 8092616 |
| 2.3008 | 20.7859 | 25400 | 3.2078 | 8157000 |
| 3.4068 | 20.9497 | 25600 | 3.2114 | 8220920 |
| 2.9892 | 21.1130 | 25800 | 3.2217 | 8284832 |
| 4.825 | 21.2767 | 26000 | 3.1974 | 8348832 |
| 2.818 | 21.4404 | 26200 | 3.2080 | 8412992 |
| 3.9167 | 21.6042 | 26400 | 3.2175 | 8476944 |
| 3.742 | 21.7679 | 26600 | 3.2088 | 8541536 |
| 5.8749 | 21.9316 | 26800 | 3.2172 | 8606128 |
| 1.9666 | 22.0950 | 27000 | 3.1960 | 8670264 |
| 3.3397 | 22.2587 | 27200 | 3.2020 | 8734456 |
| 3.8228 | 22.4224 | 27400 | 3.1910 | 8798776 |
| 3.4253 | 22.5862 | 27600 | 3.2239 | 8862888 |
| 4.1324 | 22.7499 | 27800 | 3.2102 | 8927464 |
| 2.1664 | 22.9136 | 28000 | 3.2067 | 8991912 |
| 3.3336 | 23.0770 | 28200 | 3.2092 | 9055920 |
| 3.3789 | 23.2407 | 28400 | 3.2119 | 9120064 |
| 5.0599 | 23.4044 | 28600 | 3.2146 | 9184496 |
| 3.0677 | 23.5682 | 28800 | 3.2242 | 9248672 |
| 3.5638 | 23.7319 | 29000 | 3.2197 | 9312880 |
| 3.5793 | 23.8956 | 29200 | 3.2128 | 9377264 |
| 3.3808 | 24.0589 | 29400 | 3.1855 | 9441584 |
| 2.9799 | 24.2227 | 29600 | 3.2139 | 9505936 |
| 1.491 | 24.3864 | 29800 | 3.2104 | 9570272 |
| 4.1475 | 24.5501 | 30000 | 3.1904 | 9634480 |
| 5.207 | 24.7139 | 30200 | 3.2083 | 9698784 |
| 3.496 | 24.8776 | 30400 | 3.2235 | 9762800 |
| 5.8899 | 25.0409 | 30600 | 3.1862 | 9826744 |
| 2.6238 | 25.2047 | 30800 | 3.2107 | 9890760 |
| 3.2139 | 25.3684 | 31000 | 3.2059 | 9955112 |
| 3.3207 | 25.5321 | 31200 | 3.1964 | 10019448 |
| 2.3831 | 25.6959 | 31400 | 3.2035 | 10083848 |
| 2.4526 | 25.8596 | 31600 | 3.1851 | 10147752 |
| 3.5566 | 26.0229 | 31800 | 3.2099 | 10211912 |
| 3.7733 | 26.1867 | 32000 | 3.2157 | 10275928 |
| 3.0118 | 26.3504 | 32200 | 3.2315 | 10340168 |
| 6.6009 | 26.5141 | 32400 | 3.2197 | 10404376 |
| 2.9853 | 26.6779 | 32600 | 3.2137 | 10469048 |
| 2.4959 | 26.8416 | 32800 | 3.2112 | 10533640 |
| 2.538 | 27.0049 | 33000 | 3.1927 | 10597888 |
| 2.8235 | 27.1686 | 33200 | 3.2044 | 10662240 |
| 3.1748 | 27.3324 | 33400 | 3.2107 | 10726640 |
| 2.7298 | 27.4961 | 33600 | 3.2253 | 10790608 |
| 5.0767 | 27.6598 | 33800 | 3.2285 | 10854688 |
| 1.4213 | 27.8236 | 34000 | 3.2188 | 10919360 |
| 2.7594 | 27.9873 | 34200 | 3.2144 | 10983664 |
| 3.118 | 28.1506 | 34400 | 3.2237 | 11047464 |
| 3.26 | 28.3144 | 34600 | 3.1957 | 11111848 |
| 2.3447 | 28.4781 | 34800 | 3.2221 | 11176376 |
| 3.8171 | 28.6418 | 35000 | 3.2131 | 11241256 |
| 4.0589 | 28.8056 | 35200 | 3.1951 | 11305112 |
| 3.7132 | 28.9693 | 35400 | 3.2159 | 11369464 |
| 5.1964 | 29.1326 | 35600 | 3.2144 | 11433608 |
| 3.0752 | 29.2964 | 35800 | 3.2138 | 11497944 |
| 3.2964 | 29.4601 | 36000 | 3.2134 | 11562200 |
| 4.1533 | 29.6238 | 36200 | 3.2109 | 11626152 |
| 2.9437 | 29.7876 | 36400 | 3.2126 | 11690824 |
| 1.688 | 29.9513 | 36600 | 3.2131 | 11755016 |
| 2.9703 | 30.1146 | 36800 | 3.2125 | 11818880 |
| 4.8531 | 30.2783 | 37000 | 3.2125 | 11882768 |
| 3.5959 | 30.4421 | 37200 | 3.2125 | 11946912 |
| 3.9701 | 30.6058 | 37400 | 3.2125 | 12011696 |
| 2.3307 | 30.7695 | 37600 | 3.2125 | 12075664 |
| 7.4112 | 30.9333 | 37800 | 3.2125 | 12139680 |
| 3.0752 | 31.0966 | 38000 | 3.2125 | 12204000 |
| 3.0064 | 31.2603 | 38200 | 3.2125 | 12268800 |
| 4.9243 | 31.4241 | 38400 | 3.2125 | 12333024 |
| 4.4955 | 31.5878 | 38600 | 3.2125 | 12396976 |
| 2.4931 | 31.7515 | 38800 | 3.2125 | 12461104 |
| 3.6867 | 31.9153 | 39000 | 3.2125 | 12524768 |
| 1.5921 | 32.0786 | 39200 | 3.2125 | 12588496 |
| 4.4451 | 32.2423 | 39400 | 3.2125 | 12653136 |
| 3.3934 | 32.4061 | 39600 | 3.2125 | 12717328 |
| 2.3516 | 32.5698 | 39800 | 3.2125 | 12781536 |
| 3.6117 | 32.7335 | 40000 | 3.2125 | 12845616 |
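A few quantities can be read off the table with simple arithmetic. The sketch below uses a handful of rows copied from the table (not the full log) to locate the lowest validation loss and to estimate steps per epoch and input tokens per optimizer step. The helper names are hypothetical and introduced only for illustration.

```python
# A few rows copied from the table above, as
# (training_loss, epoch, step, validation_loss, input_tokens_seen).
ROWS = [
    (3.9958, 0.1637, 200, 3.3872, 64080),
    (4.5054, 1.3095, 1600, 3.1879, 514072),
    (4.0822, 13.5845, 16600, 3.1788, 5330256),
    (2.1674, 15.3848, 18800, 3.1765, 6037144),
    (2.9703, 30.1146, 36800, 3.2125, 11818880),
    (3.6117, 32.7335, 40000, 3.2125, 12845616),
]


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def steps_per_epoch(row):
    """Estimate optimizer steps per epoch from a row's step and epoch."""
    return row[2] / row[1]


def tokens_per_step(row):
    """Average input tokens consumed per optimizer step up to this row."""
    return row[4] / row[2]


if __name__ == "__main__":
    best = best_row(ROWS)
    final = ROWS[-1]
    print("best step:", best[2], "validation loss:", best[3])
    print("steps per epoch:", round(steps_per_epoch(final)))
    print("tokens per step:", round(tokens_per_step(final), 1))
```

The final row gives roughly 1222 steps per epoch and about 321 input tokens per optimizer step. The lowest validation loss among these rows, 3.1765 at step 18800, matches the loss reported at the top of the card.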
Framework versions
- PEFT 0.15.2.dev0
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
Model tree for rbelanec/train_wic_1745950295
- Base model: mistralai/Mistral-7B-v0.3
- Finetuned: mistralai/Mistral-7B-Instruct-v0.3