# XLM-roberta-large-ftit-emb-lr01
This model is a fine-tuned version of Zamza/XLM-roberta-large-ftit-emb-4 on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.4811
## Model description
More information needed
## Intended uses & limitations
More information needed
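Since the intended use is not documented, the following is only a minimal sketch: it assumes the checkpoint is published under the repository id `Zamza/XLM-roberta-large-ftit-emb-lr01`, loads with the standard `transformers` `AutoModel` API, and is used to produce sentence embeddings via mean pooling (suggested by the `emb` suffix in the name, but not confirmed by this card).

```python
# Minimal usage sketch (assumptions: repository id and mean-pooling strategy).
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "Zamza/XLM-roberta-large-ftit-emb-lr01"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

sentences = ["Una frase di esempio.", "An example sentence."]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**batch)

# Mean pooling over non-padding tokens (assumption; replace with the pooling
# strategy the model was actually trained with, if known).
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # (2, 1024) for an XLM-RoBERTa-large backbone
```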
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (an approximate `TrainingArguments` reconstruction is sketched after the list):
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 22
- mixed_precision_training: Native AMP
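The sketch below reproduces these settings with the `transformers` `Trainer` configuration. The output directory and the evaluation/logging intervals are assumptions (the 10,000-step interval is inferred from the results table), and the dataset, model head, and training loop are not documented here.

```python
# Hedged sketch: approximate reconstruction of the reported hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xlm-roberta-large-ftit-emb-lr01",  # assumed output directory
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,   # effective train batch size of 16
    seed=42,
    optim="adamw_torch",             # AdamW with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=22,
    fp16=True,                       # native AMP mixed-precision training
    eval_strategy="steps",
    eval_steps=10_000,               # assumed; matches the interval in the results table
    logging_steps=10_000,            # assumed
)
```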
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.6925 | 0.3026 | 10000 | 0.6256 |
| 0.6245 | 0.6052 | 20000 | 0.5774 |
| 0.6195 | 0.9079 | 30000 | 0.5596 |
| 0.6407 | 1.2105 | 40000 | 0.6097 |
| 0.6424 | 1.5131 | 50000 | 0.5653 |
| 0.6288 | 1.8157 | 60000 | 0.5666 |
| 0.5876 | 2.1183 | 70000 | 0.5434 |
| 0.5847 | 2.4209 | 80000 | 0.5424 |
| 0.5846 | 2.7236 | 90000 | 0.5644 |
| 0.5804 | 3.0262 | 100000 | 0.5419 |
| 0.5684 | 3.3288 | 110000 | 0.5305 |
| 0.5763 | 3.6314 | 120000 | 0.5350 |
| 0.5819 | 3.9340 | 130000 | 0.5270 |
| 0.5584 | 4.2367 | 140000 | 0.5296 |
| 0.5752 | 4.5393 | 150000 | 0.5318 |
| 0.5554 | 4.8419 | 160000 | 0.5205 |
| 0.5682 | 5.1445 | 170000 | 0.5303 |
| 0.5414 | 5.4471 | 180000 | 0.5199 |
| 0.5427 | 5.7497 | 190000 | 0.5101 |
| 0.5471 | 6.0525 | 200000 | 0.5161 |
| 0.5687 | 6.3552 | 210000 | 0.5159 |
| 0.5405 | 6.6578 | 220000 | 0.5229 |
| 0.5463 | 6.9604 | 230000 | 0.5193 |
| 0.5412 | 7.2630 | 240000 | 0.5147 |
| 0.5336 | 7.5656 | 250000 | 0.5097 |
| 0.5377 | 7.8683 | 260000 | 0.5032 |
| 0.5443 | 8.1709 | 270000 | 0.5103 |
| 0.5261 | 8.4735 | 280000 | 0.5069 |
| 0.5339 | 8.7761 | 290000 | 0.5056 |
| 0.5434 | 9.0787 | 300000 | 0.5048 |
| 0.5379 | 9.3813 | 310000 | 0.5016 |
| 0.527 | 9.6840 | 320000 | 0.5052 |
| 0.5446 | 9.9866 | 330000 | 0.5066 |
| 0.5351 | 10.2892 | 340000 | 0.4997 |
| 0.536 | 10.5918 | 350000 | 0.4956 |
| 0.5215 | 10.8944 | 360000 | 0.4969 |
| 0.5311 | 11.1970 | 370000 | 0.5092 |
| 0.5221 | 11.4997 | 380000 | 0.4936 |
| 0.5295 | 11.8024 | 390000 | 0.4897 |
| 0.5173 | 12.1051 | 400000 | 0.4980 |
| 0.5164 | 12.4077 | 410000 | 0.4858 |
| 0.5185 | 12.7103 | 420000 | 0.4967 |
| 0.5125 | 13.0129 | 430000 | 0.4973 |
| 0.5216 | 13.3155 | 440000 | 0.4900 |
| 0.5133 | 13.6182 | 450000 | 0.4878 |
| 0.5195 | 13.9208 | 460000 | 0.4938 |
| 0.5163 | 14.2234 | 470000 | 0.4940 |
| 0.5008 | 14.5260 | 480000 | 0.4925 |
| 0.5144 | 14.8286 | 490000 | 0.4885 |
| 0.5265 | 15.1312 | 500000 | 0.4925 |
| 0.5102 | 15.4339 | 510000 | 0.4957 |
| 0.5076 | 15.7365 | 520000 | 0.4923 |
| 0.5156 | 16.0391 | 530000 | 0.5032 |
| 0.5236 | 16.3417 | 540000 | 0.4974 |
| 0.5168 | 16.6443 | 550000 | 0.4826 |
| 0.4977 | 16.9470 | 560000 | 0.4860 |
| 0.5102 | 17.2496 | 570000 | 0.4889 |
| 0.4992 | 17.5523 | 580000 | 0.4789 |
| 0.516 | 17.8550 | 590000 | 0.4967 |
| 0.5018 | 18.1576 | 600000 | 0.4899 |
| 0.5094 | 18.4602 | 610000 | 0.4881 |
| 0.4991 | 18.7629 | 620000 | 0.4861 |
| 0.4955 | 19.0655 | 630000 | 0.4809 |
| 0.4965 | 19.3681 | 640000 | 0.4871 |
| 0.4937 | 19.6707 | 650000 | 0.4836 |
| 0.5048 | 19.9733 | 660000 | 0.4955 |
| 0.5019 | 20.2759 | 670000 | 0.4765 |
| 0.4912 | 20.5786 | 680000 | 0.4911 |
| 0.4891 | 20.8812 | 690000 | 0.4837 |
| 0.5084 | 21.1838 | 700000 | 0.4931 |
| 0.4945 | 21.4864 | 710000 | 0.4825 |
| 0.5014 | 21.7890 | 720000 | 0.4811 |
### Framework versions
- Transformers 4.48.3
- Pytorch 2.5.1+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0
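As a sanity check, the versions listed above can be compared against the local environment. This is a minimal sketch assuming the standard PyPI package names.

```python
# Compare installed library versions against those listed in this card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.48.3",
    "torch": "2.5.1",
    "datasets": "3.3.2",
    "tokenizers": "0.21.0",
}

for name, module in [
    ("transformers", transformers),
    ("torch", torch),
    ("datasets", datasets),
    ("tokenizers", tokenizers),
]:
    installed = module.__version__
    status = "OK" if installed.startswith(expected[name]) else "differs"
    print(f"{name}: installed {installed}, card lists {expected[name]} ({status})")
```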