results

This model is a fine-tuned version of distilbert-base-uncased. The training dataset is not recorded in this card (the auto-generated text lists it as None). It achieves the following results on the evaluation set:

Loss: 1.0016
Accuracy: 0.7135
F1: 0.7084

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08, no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 3
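
As a rough illustration of what lr_scheduler_type: linear implies for the learning rate, the sketch below decays 2e-05 linearly to zero over the run. It assumes zero warmup steps (no warmup setting is listed above) and estimates the total step count from the last logged row of the Training results table (step 35000 at epoch 2.9696). The function name linear_lr is hypothetical and not part of any library.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-05) -> float:
    """Learning rate after `step` updates under linear decay with zero warmup (assumed)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps)


# Steps per epoch estimated from the last logged row: step 35000 at epoch 2.9696.
steps_per_epoch = round(35000 / 2.9696)
total_steps = 3 * steps_per_epoch  # num_epochs: 3

# Print the rate at the start, midpoint, last logged step, and end of training.
for step in (0, total_steps // 2, 35000, total_steps):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the rate is about half the initial value at the midpoint of training and is already close to zero by the last logged evaluation.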

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
1.2147	0.0424	500	1.4753	0.6069	0.5742
1.024	0.0848	1000	1.4624	0.6169	0.5880
1.3489	0.1273	1500	1.3591	0.6292	0.5975
1.3487	0.1697	2000	1.2964	0.6416	0.6179
1.2584	0.2121	2500	1.2626	0.6419	0.6290
1.2656	0.2545	3000	1.2225	0.6556	0.6334
1.2501	0.2970	3500	1.1955	0.6550	0.6344
1.1692	0.3394	4000	1.1675	0.6656	0.6518
1.1625	0.3818	4500	1.1735	0.6612	0.6471
1.2122	0.4242	5000	1.1384	0.6718	0.6566
1.1813	0.4667	5500	1.1344	0.6720	0.6572
1.1571	0.5091	6000	1.1228	0.6763	0.6666
1.1468	0.5515	6500	1.1067	0.6728	0.6671
1.1663	0.5939	7000	1.0877	0.6800	0.6716
1.0567	0.6363	7500	1.0971	0.6798	0.6725
1.0834	0.6788	8000	1.0802	0.6863	0.6745
1.1045	0.7212	8500	1.0645	0.6871	0.6753
1.0942	0.7636	9000	1.0495	0.6936	0.6827
1.0286	0.8060	9500	1.0579	0.6909	0.6766
1.0633	0.8485	10000	1.0628	0.6845	0.6764
1.0718	0.8909	10500	1.0430	0.6944	0.6858
1.0848	0.9333	11000	1.0288	0.6933	0.6870
1.0124	0.9757	11500	1.0291	0.6946	0.6884
0.8907	1.0182	12000	1.0314	0.6945	0.6878
0.8527	1.0606	12500	1.0173	0.7021	0.6952
0.79	1.1030	13000	1.0402	0.6960	0.6866
0.8419	1.1454	13500	1.0281	0.7004	0.6925
0.8665	1.1878	14000	1.0244	0.7003	0.6938
0.8793	1.2303	14500	1.0221	0.7008	0.6930
0.8335	1.2727	15000	1.0097	0.7012	0.6955
0.8149	1.3151	15500	1.0163	0.7019	0.6955
0.8193	1.3575	16000	1.0248	0.7006	0.6939
0.8453	1.4000	16500	1.0151	0.7025	0.6956
0.8591	1.4424	17000	1.0110	0.7043	0.6945
0.8581	1.4848	17500	1.0132	0.7050	0.6958
0.9052	1.5272	18000	1.0104	0.7036	0.6981
0.8667	1.5697	18500	1.0080	0.7057	0.6970
0.8016	1.6121	19000	1.0098	0.7012	0.6963
0.8507	1.6545	19500	1.0061	0.7044	0.6975
0.8037	1.6969	20000	1.0095	0.7069	0.6985
0.8371	1.7394	20500	1.0007	0.7077	0.6980
0.7558	1.7818	21000	0.9975	0.7035	0.6985
0.7919	1.8242	21500	0.9937	0.7077	0.6998
0.8059	1.8666	22000	0.9900	0.7097	0.7037
0.799	1.9090	22500	0.9918	0.7112	0.7054
0.8072	1.9515	23000	0.9875	0.7098	0.7020
0.8052	1.9939	23500	0.9902	0.7088	0.7017
0.6761	2.0363	24000	1.0025	0.7079	0.7009
0.7107	2.0787	24500	1.0087	0.7108	0.7053
0.667	2.1212	25000	1.0080	0.7090	0.7042
0.6489	2.1636	25500	1.0024	0.7089	0.7035
0.6945	2.2060	26000	1.0097	0.7107	0.7039
0.6609	2.2484	26500	1.0089	0.7092	0.7036
0.6442	2.2909	27000	1.0178	0.7113	0.7037
0.6822	2.3333	27500	1.0124	0.7099	0.7048
0.6677	2.3757	28000	1.0089	0.7089	0.7034
0.6272	2.4181	28500	1.0051	0.7114	0.7062
0.6336	2.4605	29000	1.0110	0.7121	0.7075
0.6247	2.5030	29500	1.0089	0.7106	0.7056
0.6635	2.5454	30000	1.0112	0.7131	0.7077
0.6401	2.5878	30500	1.0092	0.7127	0.7076
0.6488	2.6302	31000	1.0081	0.7115	0.7062
0.64	2.6727	31500	1.0066	0.7124	0.7077
0.6764	2.7151	32000	1.0050	0.7123	0.7077
0.6554	2.7575	32500	1.0062	0.7124	0.7070
0.6239	2.7999	33000	1.0055	0.7128	0.7074
0.669	2.8424	33500	1.0045	0.7129	0.7076
0.6742	2.8848	34000	1.0019	0.7138	0.7084
0.5769	2.9272	34500	1.0017	0.7136	0.7086
0.6783	2.9696	35000	1.0016	0.7135	0.7084
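
The table can be scanned for the best checkpoint under each metric. The sketch below does this on a six-row excerpt copied from the table above; the tuple layout and variable names are illustrative choices, not something the training script is known to use. In the full table the lowest validation loss (0.9875) occurs at step 23000, near the end of the second epoch, while accuracy and F1 peak in the last few evaluations. That pattern may indicate mild overfitting during the third epoch, although the card does not say which checkpoint was kept.

```python
# Excerpt of the results table: (step, epoch, validation_loss, accuracy, f1)
rows = [
    (22500, 1.9090, 0.9918, 0.7112, 0.7054),
    (23000, 1.9515, 0.9875, 0.7098, 0.7020),
    (23500, 1.9939, 0.9902, 0.7088, 0.7017),
    (34000, 2.8848, 1.0019, 0.7138, 0.7084),
    (34500, 2.9272, 1.0017, 0.7136, 0.7086),
    (35000, 2.9696, 1.0016, 0.7135, 0.7084),
]

best_loss = min(rows, key=lambda r: r[2])  # lowest validation loss
best_acc = max(rows, key=lambda r: r[3])   # highest accuracy
best_f1 = max(rows, key=lambda r: r[4])    # highest F1

print("lowest validation loss at step", best_loss[0])
print("highest accuracy at step", best_acc[0])
print("highest F1 at step", best_f1[0])
```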

Framework versions

Transformers 4.53.1
Pytorch 2.6.0+cu124
Datasets 2.14.4
Tokenizers 0.21.2

Safetensors

Model size: 67M params
Tensor type: F32
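
For a sense of scale, the parameter count and tensor type together give a rough weight-storage estimate. The sketch below treats "67M" as exactly 67,000,000 parameters, which is a rounded figure, and ignores file metadata, so the result is approximate.

```python
params = 67_000_000   # "67M params" as listed; a rounded figure
bytes_per_param = 4   # F32 stores each value in 32 bits (4 bytes)

size_bytes = params * bytes_per_param
print(f"{size_bytes / 1e6:.0f} MB ({size_bytes / 2**20:.0f} MiB)")
```

This comes to roughly 268 MB of weights, before any overhead for activations or optimizer state.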

Model tree for ntAnh-dev/news-category-classification

Base model: distilbert/distilbert-base-uncased