xnr-tts-training-data-speecht5-a

This model is a fine-tuned version of microsoft/speecht5_tts. The training dataset is not specified. On the evaluation set, it reaches a validation loss of 0.0471 at the final training step (40000).

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 3407
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 4000
training_steps: 40000
mixed_precision_training: Native AMP
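
The relationship between these values can be checked with a short sketch. It assumes a single training device (the card does not state the device count) and assumes the cosine schedule is the usual linear warmup followed by a half-cosine decay to zero. The function name `lr_at` and the constant `NUM_DEVICES` are illustrative and do not come from the training code.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 4000
TRAINING_STEPS = 40000

# Assumption: one device, so the effective batch is batch size x accumulation steps.
NUM_DEVICES = 1
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / (TRAINING_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:", total_train_batch_size)
    for step in (0, 2000, 4000, 22000, 40000):
        print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the learning rate peaks at 0.0001 at step 4000, after the first 10% of training, and decays toward zero by step 40000.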

Training Loss	Epoch	Step	Validation Loss
0.0857	4.1155	1000	0.0612
0.0726	8.2309	2000	0.0541
0.0693	12.3464	3000	0.0516
0.0642	16.4619	4000	0.0524
0.0635	20.5773	5000	0.0503
0.0634	24.6928	6000	0.0511
0.0585	28.8082	7000	0.0507
0.0596	32.9237	8000	0.0502
0.053	37.0371	9000	0.0484
0.0519	41.1526	10000	0.0488
0.051	45.2680	11000	0.0479
0.0497	49.3835	12000	0.0478
0.0494	53.4990	13000	0.0486
0.0481	57.6144	14000	0.0485
0.0479	61.7299	15000	0.0479
0.0463	65.8454	16000	0.0486
0.0464	69.9608	17000	0.0490
0.0455	74.0742	18000	0.0486
0.0464	78.1897	19000	0.0483
0.0433	82.3052	20000	0.0477
0.0445	86.4206	21000	0.0488
0.0424	90.5361	22000	0.0473
0.0456	94.6515	23000	0.0477
0.0414	98.7670	24000	0.0475
0.0421	102.8825	25000	0.0484
0.0399	106.9979	26000	0.0475
0.0424	111.1113	27000	0.0478
0.0403	115.2268	28000	0.0475
0.0389	119.3423	29000	0.0475
0.0389	123.4577	30000	0.0469
0.0382	127.5732	31000	0.0471
0.0388	131.6887	32000	0.0468
0.0386	135.8041	33000	0.0477
0.0384	139.9196	34000	0.0476
0.0385	144.0330	35000	0.0470
0.0375	148.1485	36000	0.0472
0.038	152.2639	37000	0.0473
0.0394	156.3794	38000	0.0472
0.0397	160.4948	39000	0.0472
0.0392	164.6103	40000	0.0471
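
As a reading aid for the table above, the sketch below locates the evaluation step with the lowest validation loss and estimates the number of optimizer steps per epoch from the last row. The validation losses are copied from the table; the variable names are illustrative only.

```python
# Validation loss at steps 1000, 2000, ..., 40000, copied from the table above.
VAL_LOSS = [
    0.0612, 0.0541, 0.0516, 0.0524, 0.0503, 0.0511, 0.0507, 0.0502, 0.0484, 0.0488,
    0.0479, 0.0478, 0.0486, 0.0485, 0.0479, 0.0486, 0.0490, 0.0486, 0.0483, 0.0477,
    0.0488, 0.0473, 0.0477, 0.0475, 0.0484, 0.0475, 0.0478, 0.0475, 0.0475, 0.0469,
    0.0471, 0.0468, 0.0477, 0.0476, 0.0470, 0.0472, 0.0473, 0.0472, 0.0472, 0.0471,
]
EVAL_INTERVAL = 1000  # the table reports one evaluation every 1000 steps

# Index of the smallest validation loss, converted back to a step number.
best_index = min(range(len(VAL_LOSS)), key=VAL_LOSS.__getitem__)
best_step = (best_index + 1) * EVAL_INTERVAL
best_loss = VAL_LOSS[best_index]
final_loss = VAL_LOSS[-1]

# The last row reports epoch 164.6103 at step 40000.
steps_per_epoch = 40000 / 164.6103

if __name__ == "__main__":
    print(f"best validation loss {best_loss:.4f} at step {best_step}")
    print(f"final validation loss {final_loss:.4f}")
    print(f"about {steps_per_epoch:.0f} optimizer steps per epoch")
```

The lowest validation loss in the table is 0.0468 at step 32000, only 0.0003 below the final value, so the difference may be within evaluation noise. Validation loss stays near 0.047 from about step 22000 onward while training loss continues to fall.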

Model size: 0.1B params (Safetensors)
Tensor type: F32
