# XPU_GEC_t5_char_nepali
This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.3049
## Model description
More information needed
## Intended uses & limitations
More information needed
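No detailed usage guidance is provided yet. As a rough starting point, the sketch below shows one way to load the checkpoint with the `transformers` Seq2Seq API for Nepali grammatical error correction; the repo id and the example sentence are assumptions and should be adjusted to the actual hub path.

```python
# Minimal usage sketch (assumed repo id; adjust to the actual checkpoint path).
# The model is a character-level T5 fine-tuned for Nepali GEC, so inputs and
# outputs are plain Nepali sentences.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "XPU_GEC_t5_char_nepali"  # assumption: replace with the full hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "मेरो नाम राम हो ।"  # hypothetical Nepali input sentence to correct
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```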
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 5
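For reference, the listed values map onto `Seq2SeqTrainingArguments` roughly as in the sketch below; the `output_dir` and the step-based evaluation cadence (inferred from the results table) are assumptions, not confirmed settings.

```python
# Sketch of Seq2SeqTrainingArguments matching the hyperparameters above
# (illustrative only; output_dir and evaluation cadence are assumptions).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="XPU_GEC_t5_char_nepali",  # assumption
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,   # effective train batch size: 64
    num_train_epochs=5,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    eval_strategy="steps",           # assumption: evaluated every 1000 steps per the table
    eval_steps=1000,
)
```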
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.441 | 0.1463 | 1000 | 1.3825 |
| 1.3869 | 0.2926 | 2000 | 1.3707 |
| 1.4013 | 0.4389 | 3000 | 1.3782 |
| 1.4017 | 0.5852 | 4000 | 1.3901 |
| 1.4576 | 0.7316 | 5000 | 1.3772 |
| 1.385 | 0.8779 | 6000 | 1.3661 |
| 1.3838 | 1.0243 | 7000 | 1.3771 |
| 1.3816 | 1.1706 | 8000 | 1.3704 |
| 1.3717 | 1.3169 | 9000 | 1.3563 |
| 1.4436 | 1.4632 | 10000 | 1.3743 |
| 1.3712 | 1.6095 | 11000 | 1.3654 |
| 1.3618 | 1.7558 | 12000 | 1.3616 |
| 1.3572 | 1.9022 | 13000 | 1.3543 |
| 1.3633 | 2.0486 | 14000 | 1.3803 |
| 1.3562 | 2.1949 | 15000 | 1.3512 |
| 1.3506 | 2.3412 | 16000 | 1.3588 |
| 1.3506 | 2.4875 | 17000 | 1.3482 |
| 1.3442 | 2.6338 | 18000 | 1.3412 |
| 1.4396 | 2.7801 | 19000 | 1.3690 |
| 1.3469 | 2.9264 | 20000 | 1.3529 |
| 1.3424 | 3.0729 | 21000 | 1.3411 |
| 1.3401 | 3.2192 | 22000 | 1.3394 |
| 1.3395 | 3.3655 | 23000 | 1.3493 |
| 1.3281 | 3.5118 | 24000 | 1.3266 |
| 1.3318 | 3.6581 | 25000 | 1.3248 |
| 1.3233 | 3.8044 | 26000 | 1.3146 |
| 1.3183 | 3.9507 | 27000 | 1.3146 |
| 1.3197 | 4.0972 | 28000 | 1.3140 |
| 1.3158 | 4.2435 | 29000 | 1.3088 |
| 1.4365 | 4.3898 | 30000 | 1.3485 |
| 1.3164 | 4.5361 | 31000 | 1.3098 |
| 1.3095 | 4.6824 | 32000 | 1.3065 |
| 1.3089 | 4.8287 | 33000 | 1.3057 |
| 1.3065 | 4.9750 | 34000 | 1.3049 |
### Framework versions
- Transformers 4.50.0.dev0
- Pytorch 2.5.1+cxx11.abi
- Datasets 3.4.0
- Tokenizers 0.21.1