mui-muiNT-audio-aligned-speecht5

This model is a fine-tuned version of microsoft/speecht5_tts. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:

  • Loss: 0.0432

Model description

More information needed

Intended uses & limitations

More information needed
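
Until fuller usage notes are available, the following is a minimal inference sketch, not a confirmed recipe. It assumes the repository id sil-ai/mui-muiNT-audio-aligned-speecht5 (as shown on the model page), that the repository ships processor files (if it does not, loading the processor from microsoft/speecht5_tts is a reasonable fallback), that the stock microsoft/speecht5_hifigan vocoder is appropriate, and that the model expects a 512-dimensional x-vector speaker embedding like the base model. The helper name `synthesize` and the zero-vector placeholder embedding are illustrative only.

```python
MODEL_ID = "sil-ai/mui-muiNT-audio-aligned-speecht5"
VOCODER_ID = "microsoft/speecht5_hifigan"  # assumption: stock SpeechT5 vocoder
SAMPLE_RATE = 16_000                       # SpeechT5 with this vocoder yields 16 kHz audio
SPEAKER_EMBEDDING_DIM = 512                # x-vector size used by the base model


def synthesize(text, speaker_embedding=None):
    """Return a 1-D waveform tensor for `text` (hypothetical helper)."""
    # Imported lazily so this file can be imported without torch/transformers.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    processor = SpeechT5Processor.from_pretrained(MODEL_ID)
    model = SpeechT5ForTextToSpeech.from_pretrained(MODEL_ID)
    vocoder = SpeechT5HifiGan.from_pretrained(VOCODER_ID)

    if speaker_embedding is None:
        # Placeholder only: replace with a real x-vector of shape (1, 512)
        # from a speaker the model was trained on; zeros may sound poor.
        speaker_embedding = torch.zeros((1, SPEAKER_EMBEDDING_DIM))

    inputs = processor(text=text, return_tensors="pt")
    with torch.no_grad():
        waveform = model.generate_speech(
            inputs["input_ids"], speaker_embedding, vocoder=vocoder
        )
    return waveform
```

Calling `synthesize("<your text>")` downloads the weights on first use; the returned tensor can be written to a WAV file at `SAMPLE_RATE` with any audio library.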

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 3407
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 4000
  • training_steps: 40000
  • mixed_precision_training: Native AMP
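
To make the listed values concrete, the sketch below reproduces the effective batch size and the learning-rate curve they imply. It assumes a single device (so the total batch is train_batch_size times gradient_accumulation_steps) and assumes the cosine scheduler uses linear warmup from zero followed by a single half-cosine decay to zero, which is the usual default; the function name `cosine_lr` is hypothetical.

```python
import math

LEARNING_RATE = 1e-4
WARMUP_STEPS = 4000
TRAINING_STEPS = 40000
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4

# Assumes one device: 8 samples per micro-batch x 4 accumulation steps.
effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS


def cosine_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TRAINING_STEPS):
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print("effective batch size:", effective_batch)
for step in (0, 2000, 4000, 22000, 40000):
    print(f"step {step:>5}: lr = {cosine_lr(step):.2e}")
```

Under these assumptions the rate peaks at 1e-4 when warmup ends at step 4000, is half that value at step 22000, and reaches zero at step 40000.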

Training results

Training Loss | Epoch    | Step  | Validation Loss
0.0647        | 5.4966   | 1000  | 0.0475
0.0527        | 10.9931  | 2000  | 0.0442
0.0516        | 16.4855  | 3000  | 0.0443
0.0541        | 21.9821  | 4000  | 0.0433
0.0478        | 27.4745  | 5000  | 0.0419
0.0479        | 32.9710  | 6000  | 0.0423
0.0456        | 38.4634  | 7000  | 0.0427
0.0445        | 43.9600  | 8000  | 0.0452
0.0428        | 49.4524  | 9000  | 0.0416
0.0440        | 54.9490  | 10000 | 0.0425
0.0434        | 60.4414  | 11000 | 0.0425
0.0411        | 65.9379  | 12000 | 0.0433
0.0417        | 71.4303  | 13000 | 0.0428
0.0409        | 76.9269  | 14000 | 0.0419
0.0414        | 82.4193  | 15000 | 0.0438
0.0416        | 87.9159  | 16000 | 0.0435
0.0379        | 93.4083  | 17000 | 0.0427
0.0395        | 98.9048  | 18000 | 0.0432
0.0394        | 104.3972 | 19000 | 0.0431
0.0380        | 109.8938 | 20000 | 0.0427
0.0355        | 115.3862 | 21000 | 0.0427
0.0374        | 120.8828 | 22000 | 0.0426
0.0348        | 126.3752 | 23000 | 0.0427
0.0348        | 131.8717 | 24000 | 0.0429
0.0357        | 137.3641 | 25000 | 0.0424
0.0356        | 142.8607 | 26000 | 0.0429
0.0351        | 148.3531 | 27000 | 0.0435
0.0341        | 153.8497 | 28000 | 0.0431
0.0340        | 159.3421 | 29000 | 0.0429
0.0341        | 164.8386 | 30000 | 0.0429
0.0333        | 170.3310 | 31000 | 0.0434
0.0334        | 175.8276 | 32000 | 0.0431
0.0340        | 181.3200 | 33000 | 0.0433
0.0335        | 186.8166 | 34000 | 0.0432
0.0332        | 192.3090 | 35000 | 0.0430
0.0332        | 197.8055 | 36000 | 0.0431
0.0331        | 203.2979 | 37000 | 0.0428
0.0339        | 208.7945 | 38000 | 0.0430
0.0333        | 214.2869 | 39000 | 0.0433
0.0337        | 219.7834 | 40000 | 0.0432
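
The logged values can be summarized with a short script. The sketch below parses the rows of the table above and reports the checkpoint with the lowest validation loss, the final gap between training and validation loss, and a rough steps-per-epoch figure. The steps-per-epoch number is an estimate derived from the logged epoch column, and reading a dataset size from it would additionally assume a single device and a total batch size of 32.

```python
# Columns: training loss, epoch, step, validation loss (copied from the table).
LOG = """
0.0647 5.4966 1000 0.0475
0.0527 10.9931 2000 0.0442
0.0516 16.4855 3000 0.0443
0.0541 21.9821 4000 0.0433
0.0478 27.4745 5000 0.0419
0.0479 32.9710 6000 0.0423
0.0456 38.4634 7000 0.0427
0.0445 43.96 8000 0.0452
0.0428 49.4524 9000 0.0416
0.044 54.9490 10000 0.0425
0.0434 60.4414 11000 0.0425
0.0411 65.9379 12000 0.0433
0.0417 71.4303 13000 0.0428
0.0409 76.9269 14000 0.0419
0.0414 82.4193 15000 0.0438
0.0416 87.9159 16000 0.0435
0.0379 93.4083 17000 0.0427
0.0395 98.9048 18000 0.0432
0.0394 104.3972 19000 0.0431
0.038 109.8938 20000 0.0427
0.0355 115.3862 21000 0.0427
0.0374 120.8828 22000 0.0426
0.0348 126.3752 23000 0.0427
0.0348 131.8717 24000 0.0429
0.0357 137.3641 25000 0.0424
0.0356 142.8607 26000 0.0429
0.0351 148.3531 27000 0.0435
0.0341 153.8497 28000 0.0431
0.034 159.3421 29000 0.0429
0.0341 164.8386 30000 0.0429
0.0333 170.3310 31000 0.0434
0.0334 175.8276 32000 0.0431
0.034 181.32 33000 0.0433
0.0335 186.8166 34000 0.0432
0.0332 192.3090 35000 0.0430
0.0332 197.8055 36000 0.0431
0.0331 203.2979 37000 0.0428
0.0339 208.7945 38000 0.0430
0.0333 214.2869 39000 0.0433
0.0337 219.7834 40000 0.0432
"""

rows = [
    (float(train), float(epoch), int(step), float(val))
    for train, epoch, step, val in (line.split() for line in LOG.strip().splitlines())
]

best = min(rows, key=lambda r: r[3])      # row with the lowest validation loss
final = rows[-1]                          # last logged evaluation
gap = final[3] - final[0]                 # validation minus training loss at the end
steps_per_epoch = final[2] / final[1]     # rough estimate from the epoch column

print(f"best validation loss {best[3]} at step {best[2]}")
print(f"final train/validation gap: {gap:.4f}")
print(f"approx. optimizer steps per epoch: {steps_per_epoch:.1f}")
```

On these numbers the lowest validation loss (0.0416) occurs at step 9000, after which validation loss stays between roughly 0.042 and 0.044 while training loss keeps falling. That pattern may indicate that an earlier checkpoint would serve as well as the final one, though the card does not say which checkpoint was published.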

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.2
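
When reproducing results, it can help to compare the local environment against the versions listed above. The sketch below does this with the standard library only. The PyPI distribution names (for example `torch` for Pytorch) are assumptions, and `find_mismatches` is a hypothetical helper; a difference in the local build tag such as `+cu128` is reported as a mismatch even though it may be harmless.

```python
from importlib import metadata

# Versions from the "Framework versions" list; distribution names are assumed.
EXPECTED = {
    "transformers": "4.57.1",
    "torch": "2.8.0+cu128",
    "datasets": "4.2.0",
    "tokenizers": "0.22.2",
}


def find_mismatches(expected=EXPECTED):
    """Return {package: (expected, installed or None)} for each mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in find_mismatches().items():
    print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```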