bcc-arbnaskh-audio-aligned-speecht5

This model is a fine-tuned version of microsoft/speecht5_tts (the training dataset is not specified in this card). It achieves the following results on the evaluation set:

  • Loss: 0.0666

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 3407
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 4000
  • training_steps: 40000
  • mixed_precision_training: Native AMP
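
Note that total_train_batch_size is not an independent setting: it follows from the per-device batch size and the gradient-accumulation steps. A minimal check of that arithmetic:

```python
# Effective (total) train batch size = per-device batch size
# multiplied by the number of gradient-accumulation steps.
train_batch_size = 8
gradient_accumulation_steps = 4
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 32
```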

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.0763        | 20.0  | 1000  | 0.0587          |
| 0.0664        | 40.0  | 2000  | 0.0518          |
| 0.0575        | 60.0  | 3000  | 0.0504          |
| 0.0559        | 80.0  | 4000  | 0.0505          |
| 0.0492        | 100.0 | 5000  | 0.0524          |
| 0.0463        | 120.0 | 6000  | 0.0543          |
| 0.0445        | 140.0 | 7000  | 0.0536          |
| 0.0456        | 160.0 | 8000  | 0.0549          |
| 0.0413        | 180.0 | 9000  | 0.0554          |
| 0.0418        | 200.0 | 10000 | 0.0557          |
| 0.0385        | 220.0 | 11000 | 0.0571          |
| 0.0376        | 240.0 | 12000 | 0.0577          |
| 0.0390        | 260.0 | 13000 | 0.0593          |
| 0.0369        | 280.0 | 14000 | 0.0581          |
| 0.0343        | 300.0 | 15000 | 0.0592          |
| 0.0367        | 320.0 | 16000 | 0.0608          |
| 0.0333        | 340.0 | 17000 | 0.0603          |
| 0.0324        | 360.0 | 18000 | 0.0613          |
| 0.0336        | 380.0 | 19000 | 0.0614          |
| 0.0344        | 400.0 | 20000 | 0.0630          |
| 0.0324        | 420.0 | 21000 | 0.0625          |
| 0.0317        | 440.0 | 22000 | 0.0639          |
| 0.0307        | 460.0 | 23000 | 0.0649          |
| 0.0306        | 480.0 | 24000 | 0.0647          |
| 0.0303        | 500.0 | 25000 | 0.0633          |
| 0.0331        | 520.0 | 26000 | 0.0664          |
| 0.0299        | 540.0 | 27000 | 0.0645          |
| 0.0286        | 560.0 | 28000 | 0.0640          |
| 0.0287        | 580.0 | 29000 | 0.0644          |
| 0.0281        | 600.0 | 30000 | 0.0658          |
| 0.0285        | 620.0 | 31000 | 0.0660          |
| 0.0285        | 640.0 | 32000 | 0.0653          |
| 0.0365        | 660.0 | 33000 | 0.0663          |
| 0.0278        | 680.0 | 34000 | 0.0654          |
| 0.0275        | 700.0 | 35000 | 0.0663          |
| 0.0301        | 720.0 | 36000 | 0.0658          |
| 0.0279        | 740.0 | 37000 | 0.0658          |
| 0.0297        | 760.0 | 38000 | 0.0657          |
| 0.0285        | 780.0 | 39000 | 0.0661          |
| 0.0290        | 800.0 | 40000 | 0.0666          |
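
Two observations fall out of the log above. First, validation loss reaches its minimum (0.0504) around step 3000 and drifts upward from there while training loss keeps falling, so an earlier checkpoint may generalize better than the final one. Second, the epoch column lets one back out an approximate dataset size; this is a rough estimate, assuming the logged epoch counts are exact:

```python
# From the log: 1000 optimizer steps correspond to 20 epochs.
steps_per_epoch = 1000 / 20.0            # 50 optimizer steps per epoch
effective_batch_size = 32                # from the hyperparameters above
approx_num_examples = steps_per_epoch * effective_batch_size
print(approx_num_examples)  # 1600.0 -> roughly 1,600 training examples
```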

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.2
Model size: 0.1B parameters (Safetensors, F32)

Model tree for sil-ai/bcc-arbnaskh-audio-aligned-speecht5

  • Base model: microsoft/speecht5_tts