Latest commit: improz4, "MyLogBert" (71bc267, over 2 years ago)


tags:
  - generated_from_trainer
model-index:
  - name: model
    results: []

model

This model is a fine-tuned checkpoint. The auto-generated card does not name the base model, and the training dataset is recorded as None. It achieves the following results on the evaluation set:

Loss: 0.1795

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
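
To make the linear scheduler concrete, here is a minimal sketch of how the learning rate would decay over this run. It assumes no warmup (the card lists no warmup setting) and decay to zero at the final step. The 125 steps per epoch are read from the training results table in this card. The function name `linear_lr` and the constants are illustrative, not part of the training code.

```python
# Values copied from the card. warmup_steps=0 is an assumption: the card
# does not list a warmup setting.
BASE_LR = 1e-3
NUM_EPOCHS = 100
STEPS_PER_EPOCH = 125  # the results table reports step 125 at epoch 1.0
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under optional linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the scheduled learning rate at a few points in training.
for epoch in (0, 25, 50, 75, 100):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate falls to half its starting value at the midpoint of training and reaches zero at the last step, so the late epochs take very small updates.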

Training results

Training Loss	Epoch	Step	Validation Loss
0.639	1.0	125	0.3439
0.365	2.0	250	0.2767
0.3371	3.0	375	0.2818
0.3215	4.0	500	0.2573
0.2982	5.0	625	0.2449
0.2912	6.0	750	0.2377
0.2886	7.0	875	0.2535
0.2853	8.0	1000	0.2426
0.2814	9.0	1125	0.2532
0.272	10.0	1250	0.2322
0.2774	11.0	1375	0.2395
0.2692	12.0	1500	0.2360
0.2633	13.0	1625	0.2373
0.256	14.0	1750	0.2400
0.2512	15.0	1875	0.2082
0.2502	16.0	2000	0.2222
0.2573	17.0	2125	0.2183
0.2493	18.0	2250	0.2073
0.2562	19.0	2375	0.2220
0.2462	20.0	2500	0.2305
0.2495	21.0	2625	0.2098
0.2486	22.0	2750	0.2149
0.2493	23.0	2875	0.2193
0.237	24.0	3000	0.2148
0.2421	25.0	3125	0.2141
0.2391	26.0	3250	0.2252
0.2463	27.0	3375	0.2304
0.238	28.0	3500	0.2062
0.2346	29.0	3625	0.2111
0.235	30.0	3750	0.2055
0.2316	31.0	3875	0.2091
0.2385	32.0	4000	0.2135
0.2296	33.0	4125	0.2124
0.225	34.0	4250	0.2078
0.2343	35.0	4375	0.2069
0.2252	36.0	4500	0.2068
0.2279	37.0	4625	0.2187
0.223	38.0	4750	0.1986
0.2255	39.0	4875	0.1970
0.2261	40.0	5000	0.1994
0.2203	41.0	5125	0.2056
0.2223	42.0	5250	0.1934
0.2259	43.0	5375	0.2028
0.2103	44.0	5500	0.2020
0.2199	45.0	5625	0.1959
0.2211	46.0	5750	0.2017
0.2235	47.0	5875	0.1948
0.2145	48.0	6000	0.2015
0.2116	49.0	6125	0.2114
0.216	50.0	6250	0.1966
0.216	51.0	6375	0.2107
0.2121	52.0	6500	0.1939
0.2099	53.0	6625	0.2167
0.2069	54.0	6750	0.1909
0.2129	55.0	6875	0.2070
0.2105	56.0	7000	0.1927
0.2092	57.0	7125	0.1957
0.2043	58.0	7250	0.2064
0.2072	59.0	7375	0.1915
0.2069	60.0	7500	0.1972
0.2009	61.0	7625	0.1977
0.2001	62.0	7750	0.1853
0.2058	63.0	7875	0.1976
0.1944	64.0	8000	0.1887
0.1985	65.0	8125	0.1932
0.2061	66.0	8250	0.1924
0.2028	67.0	8375	0.1888
0.2016	68.0	8500	0.1967
0.2	69.0	8625	0.1983
0.201	70.0	8750	0.1862
0.1975	71.0	8875	0.1943
0.2035	72.0	9000	0.1856
0.2008	73.0	9125	0.1957
0.1966	74.0	9250	0.1843
0.1945	75.0	9375	0.1864
0.1989	76.0	9500	0.1876
0.195	77.0	9625	0.1867
0.1984	78.0	9750	0.2041
0.1974	79.0	9875	0.1996
0.1918	80.0	10000	0.1694
0.1959	81.0	10125	0.1780
0.1965	82.0	10250	0.1738
0.1917	83.0	10375	0.2058
0.1911	84.0	10500	0.1806
0.1905	85.0	10625	0.1852
0.1934	86.0	10750	0.1765
0.1874	87.0	10875	0.1858
0.1921	88.0	11000	0.1790
0.1883	89.0	11125	0.1933
0.1854	90.0	11250	0.1785
0.189	91.0	11375	0.1849
0.1873	92.0	11500	0.1880
0.1866	93.0	11625	0.1758
0.1909	94.0	11750	0.1783
0.1821	95.0	11875	0.1936
0.1868	96.0	12000	0.1954
0.1846	97.0	12125	0.1793
0.1847	98.0	12250	0.1866
0.1877	99.0	12375	0.1677
0.1845	100.0	12500	0.1718
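
Validation loss fluctuates from epoch to epoch, and the lowest value in the table (0.1677) is at epoch 99, not at the final epoch. The sketch below shows one way to locate the best row programmatically. It parses only a five-row excerpt copied from the table, and the helper names `parse_rows` and `best_checkpoint` are hypothetical.

```python
# Excerpt of the training results (epochs 96 to 100), tab-separated
# in the same column order as the table: train loss, epoch, step, val loss.
EXCERPT = """\
0.1868\t96.0\t12000\t0.1954
0.1846\t97.0\t12125\t0.1793
0.1847\t98.0\t12250\t0.1866
0.1877\t99.0\t12375\t0.1677
0.1845\t100.0\t12500\t0.1718
"""


def parse_rows(text):
    """Turn tab-separated table rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split("\t")
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        })
    return rows


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row["val_loss"])


rows = parse_rows(EXCERPT)
best = best_checkpoint(rows)
print(f"lowest validation loss: {best['val_loss']} "
      f"at epoch {best['epoch']:.0f} (step {best['step']})")
```

Pasting the full table into `EXCERPT` applies the same selection to all 100 epochs. Whether a checkpoint from that epoch was actually saved is not stated in the card.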

Framework versions

Transformers 4.33.2
Pytorch 2.0.1+cu118
Datasets 2.14.5
Tokenizers 0.13.3
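
When reproducing the run, it can help to compare the local environment against the versions listed above. The following is a rough sketch using the standard library only. The distribution names in `EXPECTED` (for example `torch` for Pytorch) are assumptions about how the packages were installed, and `find_mismatches` is a hypothetical helper.

```python
from importlib import metadata

# Versions as listed in the card. The distribution names are assumptions.
EXPECTED = {
    "transformers": "4.33.2",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.5",
    "tokenizers": "0.13.3",
}


def find_mismatches(expected=EXPECTED, get_version=metadata.version):
    """Map each package whose installed version differs to (wanted, installed)."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in find_mismatches().items():
        print(f"{package}: card lists {wanted}, found {installed}")
```

An exact string match is deliberately strict. A different CUDA build of the same Pytorch release would be reported as a mismatch.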