F5-TTS fine-tuned on a Mongolian speech dataset. Training was stopped at around epoch 33 (I initially set 100 epochs, but that was too long), at roughly 51,000 updates.
Parameters that I know of:

Base model: F5TTS Base
Epochs: 100 (stopped at ~33)
Learning rate: 0.000075
Max gradient norm: 1
Warmup updates: 57
Batch size type: frame
Batch size per GPU: 1600 (RTX 3080 Ti)
Gradient accumulation steps: 1
Max samples: 64
Precision: fp16
Logger: wandb
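For reference, the settings above can be collected into a single mapping. The key names here are illustrative and may not match the exact field names in the F5-TTS training config schema:

```python
# Fine-tuning hyperparameters as reported above.
# Key names are illustrative, not the official F5-TTS config keys.
finetune_config = {
    "base_model": "F5TTS_Base",
    "epochs": 100,                   # stopped early at ~epoch 33 (~51,000 updates)
    "learning_rate": 7.5e-5,         # 0.000075
    "max_grad_norm": 1.0,
    "warmup_updates": 57,
    "batch_size_type": "frame",
    "batch_size_per_gpu": 1600,      # RTX 3080 Ti
    "grad_accumulation_steps": 1,
    "max_samples": 64,
    "precision": "fp16",
    "logger": "wandb",
}

if __name__ == "__main__":
    for key, value in finetune_config.items():
        print(f"{key}: {value}")
```

Note that with a frame-based batch size, `batch_size_per_gpu` counts mel-spectrogram frames rather than utterances, which is why the value (1600) is much larger than a typical sample count.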