# wikidyk-scope-clf-deberta-v3-large-semantic_10_clusters
This model is a fine-tuned version of microsoft/deberta-v3-large (the fine-tuning dataset is not documented here). It achieves the following results on the evaluation set:
- Loss: 0.1626
- Accuracy: 0.9683
- F1: 0.7271
- Precision: 0.6997
- Recall: 0.7568
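The reported precision, recall, and F1 follow the standard definitions from confusion counts. A minimal sketch in pure Python, with illustrative counts that are not taken from this model's evaluation:

```python
def prf1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall, and F1 from true-positive, false-positive,
    and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Illustrative counts only -- not from this model's evaluation set.
p, r, f1 = prf1(tp=75, fp=25, fn=25)
print(p, r, f1)  # 0.75 0.75 0.75
```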
## Model description
More information needed
## Intended uses & limitations
More information needed
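No usage example is provided in the card. A minimal inference sketch, assuming the checkpoint loads as a standard `AutoModelForSequenceClassification` head (the meaning of the output labels is not documented):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "YWZBrandon/wikidyk-scope-clf-deberta-v3-large-semantic_10_clusters"

def probs_from_logits(logits: torch.Tensor) -> torch.Tensor:
    """Convert raw classifier logits to per-class probabilities."""
    return torch.softmax(logits, dim=-1)

def classify(texts: list[str]) -> torch.Tensor:
    """Return class probabilities for a batch of texts.

    Downloads the checkpoint from the Hugging Face Hub on first use.
    """
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return probs_from_logits(logits)

if __name__ == "__main__":
    print(classify(["Example sentence to score."]))
```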
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 128
- total_eval_batch_size: 128
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 10.0
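The hyperparameters above map onto `transformers` `TrainingArguments` roughly as follows. This is a sketch, not the published training script; the multi-GPU launch (e.g. `torchrun --nproc_per_node=4`) is what multiplies the per-device batch size of 32 up to the total of 128:

```python
from transformers import TrainingArguments

# Sketch of a configuration matching the reported hyperparameters;
# the original training script is not published. output_dir is a placeholder.
args = TrainingArguments(
    output_dir="wikidyk-scope-clf",
    learning_rate=2e-5,
    per_device_train_batch_size=32,   # x 4 GPUs = 128 total
    per_device_eval_batch_size=32,    # x 4 GPUs = 128 total
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10.0,
)
```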
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|---|---|---|
| 0.0185 | 1.0 | 902 | 0.1071 | 0.9652 | 0.6044 | 0.8291 | 0.4755 |
| 0.0141 | 2.0 | 1804 | 0.0937 | 0.9710 | 0.7185 | 0.7849 | 0.6624 |
| 0.0105 | 3.0 | 2706 | 0.0980 | 0.9684 | 0.7204 | 0.7115 | 0.7296 |
| 0.0043 | 4.0 | 3608 | 0.1174 | 0.9723 | 0.7422 | 0.7736 | 0.7132 |
| 0.0029 | 5.0 | 4510 | 0.1260 | 0.9704 | 0.7345 | 0.7359 | 0.7332 |
| 0.0006 | 6.0 | 5412 | 0.1351 | 0.9688 | 0.7250 | 0.7135 | 0.7368 |
| 0.0007 | 7.0 | 6314 | 0.1525 | 0.9698 | 0.7395 | 0.7133 | 0.7677 |
| 0.0002 | 8.0 | 7216 | 0.1532 | 0.9705 | 0.7436 | 0.7226 | 0.7659 |
| 0.0003 | 9.0 | 8118 | 0.1466 | 0.9730 | 0.7546 | 0.7674 | 0.7423 |
| 0.0001 | 10.0 | 9020 | 0.1626 | 0.9683 | 0.7271 | 0.6997 | 0.7568 |
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.1
- Tokenizers 0.21.1