SpeechT5_TTS_Hataw

This model is a fine-tuned version of microsoft/speecht5_tts on the HatawTTS dataset. Its validation loss at the end of training (step 10000) was 0.3476.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 10000
mixed_precision_training: Native AMP
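
The batch-size and learning-rate settings above can be checked with a short sketch. It assumes a single training device (which is consistent with 16 x 2 = 32) and assumes that `lr_scheduler_type: linear` means the usual linear warmup followed by linear decay to zero; the function names are illustrative and not part of any library.

```python
# Values copied from the hyperparameter list above.
PEAK_LR = 1e-4
WARMUP_STEPS = 500
TOTAL_STEPS = 10_000
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
NUM_DEVICES = 1  # assumption: the card does not state the device count


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accum=GRAD_ACCUM_STEPS,
                         devices=NUM_DEVICES):
    """Samples consumed per optimizer step."""
    return per_device * accum * devices


def linear_lr(step, peak_lr=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to peak_lr, then linear decay to zero at `total`."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    return peak_lr * max(0.0, (total - step) / max(1, total - warmup))


print("effective batch size:", effective_batch_size())
for step in (0, 250, 500, 5250, 10_000):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at step 500 and falls to half of its peak value at step 5250, midway through the decay phase.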

Training Loss	Epoch	Step	Validation Loss
0.5165	0.1612	200	0.4500
0.4842	0.3225	400	0.4352
0.4688	0.4837	600	0.4178
0.4588	0.6449	800	0.4199
0.4427	0.8061	1000	0.4056
0.4483	0.9674	1200	0.4426
0.4347	1.1282	1400	0.3944
0.4266	1.2894	1600	0.3933
0.4311	1.4506	1800	0.3933
0.4198	1.6119	2000	0.3853
0.4129	1.7731	2200	0.3824
0.4255	1.9343	2400	0.3924
0.4137	2.0951	2600	0.3800
0.4060	2.2563	2800	0.3772
0.4098	2.4176	3000	0.3765
0.4084	2.5788	3200	0.3728
0.4033	2.7400	3400	0.3723
0.4057	2.9012	3600	0.3750
0.4055	3.0621	3800	0.3718
0.4013	3.2233	4000	0.3688
0.3989	3.3845	4200	0.3667
0.4016	3.5457	4400	0.3683
0.4023	3.7070	4600	0.3669
0.3969	3.8682	4800	0.3636
0.4027	4.0290	5000	0.3624
0.3937	4.1902	5200	0.3652
0.3931	4.3515	5400	0.3608
0.3930	4.5127	5600	0.3624
0.3969	4.6739	5800	0.3587
0.3879	4.8351	6000	0.3580
0.389	4.9964	6200	0.3582
0.3884	5.1572	6400	0.3565
0.3885	5.3184	6600	0.3574
0.3827	5.4796	6800	0.3618
0.3843	5.6409	7000	0.3532
0.3816	5.8021	7200	0.3543
0.3826	5.9633	7400	0.3555
0.3835	6.1241	7600	0.3532
0.3778	6.2854	7800	0.3520
0.3761	6.4466	8000	0.3506
0.3745	6.6078	8200	0.3504
0.3816	6.7690	8400	0.3500
0.3784	6.9303	8600	0.3502
0.3757	7.0911	8800	0.3500
0.3756	7.2523	9000	0.3492
0.3758	7.4135	9200	0.3488
0.3781	7.5748	9400	0.3505
0.3750	7.7360	9600	0.3471
0.3796	7.8972	9800	0.3468
0.3975	8.0580	10000	0.3476
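
The log above can be summarized with a small sketch that finds the evaluation step with the lowest validation loss and estimates the number of optimizer steps per epoch. The validation losses are copied from the table in order (one value every 200 steps); the helper names are illustrative, and the steps-per-epoch figure is only an estimate derived from the last row.

```python
# Validation losses from the table, in order, one entry per 200 steps.
VAL_LOSSES = [
    0.4500, 0.4352, 0.4178, 0.4199, 0.4056, 0.4426, 0.3944, 0.3933, 0.3933, 0.3853,
    0.3824, 0.3924, 0.3800, 0.3772, 0.3765, 0.3728, 0.3723, 0.3750, 0.3718, 0.3688,
    0.3667, 0.3683, 0.3669, 0.3636, 0.3624, 0.3652, 0.3608, 0.3624, 0.3587, 0.3580,
    0.3582, 0.3565, 0.3574, 0.3618, 0.3532, 0.3543, 0.3555, 0.3532, 0.3520, 0.3506,
    0.3504, 0.3500, 0.3502, 0.3500, 0.3492, 0.3488, 0.3505, 0.3471, 0.3468, 0.3476,
]
EVAL_EVERY = 200        # spacing of the Step column
FINAL_STEP = 10_000     # last row of the table
FINAL_EPOCH = 8.0580    # Epoch value in the last row
TOTAL_TRAIN_BATCH_SIZE = 32


def best_checkpoint(val_losses, eval_every=EVAL_EVERY):
    """Return (step, loss) for the lowest validation loss in the log."""
    idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    return (idx + 1) * eval_every, val_losses[idx]


def steps_per_epoch(final_step=FINAL_STEP, final_epoch=FINAL_EPOCH):
    """Estimate optimizer steps per epoch from the final log row."""
    return final_step / final_epoch


step, loss = best_checkpoint(VAL_LOSSES)
print("best validation loss:", loss, "at step", step)

est = steps_per_epoch()
print("approx. steps per epoch:", round(est))
print("approx. samples per epoch:", round(est) * TOTAL_TRAIN_BATCH_SIZE)
```

The samples-per-epoch figure is a rough upper bound, since the last batch of an epoch may be partial.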

Safetensors

Model size: 0.1B params
Tensor type: F32
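
As a rough guide to the size of the weights, the parameter count can be multiplied by the bytes per element of the tensor type. The sketch below treats the rounded "0.1B" figure as exactly 100 million parameters, which is an assumption; the exact count is not given here, so the result is an order-of-magnitude estimate only. The helper name is illustrative.

```python
# Bytes per element for common tensor types.
BYTES_PER_DTYPE = {"F32": 4, "F16": 2, "BF16": 2}

# Assumption: "0.1B params" is a rounded figure, taken here as 100 million.
APPROX_NUM_PARAMS = 100_000_000


def weight_bytes(num_params, dtype="F32"):
    """Approximate storage needed for the weights alone."""
    return num_params * BYTES_PER_DTYPE[dtype]


size = weight_bytes(APPROX_NUM_PARAMS, "F32")
print(f"approx. weight size: {size / 1e9:.1f} GB")
```

This counts the weights only; activations, optimizer state, and any separate vocoder are not included.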
