ssc-ruc-mms-model-mix-adapt-max2
This model was trained from scratch on an unknown dataset. The final logged checkpoint (step 3200) achieves the following results on the evaluation set:
- Loss: 0.6331
- CER (character error rate): 0.1560
- WER (word error rate): 0.6447
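For readers unfamiliar with these metrics, here is a minimal sketch of how CER and WER are commonly computed: the edit distance between reference and hypothesis, divided by the reference length (in characters or in words). This card does not say which implementation or text normalization was used for the numbers above, so the helper names `edit_distance`, `wer`, and `cer` are illustrative, and the sketch assumes a non-empty reference.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)
```

A single wrong character can make a whole word count as an error, which is one reason WER can sit far above CER, as it does here.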
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 1
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 2
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 5
- mixed_precision_training: Native AMP
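The relationship between some of these values can be sketched in a few lines. The snippet below shows how the total train batch size follows from the per-device batch size and gradient accumulation, and what a linear schedule with warmup looks like. It is an illustration, not the training script: `num_devices = 1` is an assumption (it is consistent with the listed total of 2), `linear_schedule_lr` is a hypothetical helper, and `estimated_total_steps = 3365` is an estimate derived from the step and epoch columns of the training results table (about 673 steps per epoch over 5 epochs).

```python
learning_rate = 0.001
train_batch_size = 1
gradient_accumulation_steps = 2
warmup_steps = 100
num_devices = 1                 # assumption: not stated in this card
estimated_total_steps = 3365    # estimate: ~673 steps/epoch * 5 epochs

# Examples contributing to each optimizer update.
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)


def linear_schedule_lr(step, total_steps=estimated_total_steps,
                       base_lr=learning_rate, warmup=warmup_steps):
    """Linear warmup from 0 to base_lr, then linear decay to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup))
    return base_lr * remaining
```

Under these assumptions the learning rate peaks at 0.001 at step 100 and decays to 0 by the last step.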
Training results
| Training Loss | Epoch | Step | Validation Loss | CER | WER |
|---|---|---|---|---|---|
| 0.569 | 0.2972 | 200 | 0.6670 | 0.1639 | 0.6734 |
| 0.4794 | 0.5944 | 400 | 0.6744 | 0.1675 | 0.6784 |
| 0.4724 | 0.8915 | 600 | 0.6625 | 0.1647 | 0.6727 |
| 0.5633 | 1.1887 | 800 | 0.6371 | 0.1599 | 0.6703 |
| 0.4392 | 1.4859 | 1000 | 0.6562 | 0.1571 | 0.6595 |
| 0.4655 | 1.7831 | 1200 | 0.6475 | 0.1598 | 0.6661 |
| 0.4488 | 2.0802 | 1400 | 0.6615 | 0.1599 | 0.6543 |
| 0.4902 | 2.3774 | 1600 | 0.6597 | 0.1618 | 0.6736 |
| 0.4058 | 2.6746 | 1800 | 0.6560 | 0.1579 | 0.6531 |
| 0.473 | 2.9718 | 2000 | 0.6301 | 0.1581 | 0.6565 |
| 0.3782 | 3.2689 | 2200 | 0.6451 | 0.1570 | 0.6524 |
| 0.4786 | 3.5661 | 2400 | 0.6349 | 0.1571 | 0.6499 |
| 0.5757 | 3.8633 | 2600 | 0.6319 | 0.1552 | 0.6456 |
| 0.4124 | 4.1605 | 2800 | 0.6357 | 0.1576 | 0.6536 |
| 0.4385 | 4.4577 | 3000 | 0.6356 | 0.1574 | 0.6467 |
| 0.3953 | 4.7548 | 3200 | 0.6331 | 0.1560 | 0.6447 |
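The best checkpoint differs depending on which metric is used to rank the evaluations. As a small sketch, the snippet below copies the evaluation columns from the table into Python and finds the best step per metric, along with an estimate of steps per epoch. The variable names are illustrative; the numbers are taken directly from the table.

```python
# (step, epoch, validation_loss, cer, wer), copied from the table above
rows = [
    (200, 0.2972, 0.6670, 0.1639, 0.6734),
    (400, 0.5944, 0.6744, 0.1675, 0.6784),
    (600, 0.8915, 0.6625, 0.1647, 0.6727),
    (800, 1.1887, 0.6371, 0.1599, 0.6703),
    (1000, 1.4859, 0.6562, 0.1571, 0.6595),
    (1200, 1.7831, 0.6475, 0.1598, 0.6661),
    (1400, 2.0802, 0.6615, 0.1599, 0.6543),
    (1600, 2.3774, 0.6597, 0.1618, 0.6736),
    (1800, 2.6746, 0.6560, 0.1579, 0.6531),
    (2000, 2.9718, 0.6301, 0.1581, 0.6565),
    (2200, 3.2689, 0.6451, 0.1570, 0.6524),
    (2400, 3.5661, 0.6349, 0.1571, 0.6499),
    (2600, 3.8633, 0.6319, 0.1552, 0.6456),
    (2800, 4.1605, 0.6357, 0.1576, 0.6536),
    (3000, 4.4577, 0.6356, 0.1574, 0.6467),
    (3200, 4.7548, 0.6331, 0.1560, 0.6447),
]

best_loss_step = min(rows, key=lambda r: r[2])[0]  # lowest validation loss
best_cer_step = min(rows, key=lambda r: r[3])[0]   # lowest CER
best_wer_step = min(rows, key=lambda r: r[4])[0]   # lowest WER

# Steps per epoch, estimated from the last logged evaluation.
steps_per_epoch = round(rows[-1][0] / rows[-1][1])

print(best_loss_step, best_cer_step, best_wer_step, steps_per_epoch)
```

Validation loss is lowest at step 2000, CER at step 2600, and WER at step 3200, so the final checkpoint is the best one only by WER.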
Framework versions
- Transformers 4.52.1
- Pytorch 2.9.1+cu128
- Datasets 3.6.0
- Tokenizers 0.21.4