---
library_name: transformers
license: mit
base_model: microsoft/deberta-v3-base
tags:
- generated_from_trainer
metrics:
- f1
- precision
- recall
- accuracy
model-index:
- name: deberta-v3-base-uner-down200
  results: []
---
|
# deberta-v3-base-uner-down200

This model is a fine-tuned version of [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1055
- F1: 0.6677
- Precision: 0.6249
- Recall: 0.7168
- Accuracy: 0.9765
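
As a quick sanity check, the reported F1 is consistent with the harmonic mean of the reported precision and recall:

```python
# F1 is the harmonic mean of precision and recall:
# F1 = 2 * P * R / (P + R)
precision = 0.6249
recall = 0.7168

f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.6677, matching the reported F1
```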

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2.5e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 30
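
The hyperparameters above map onto a `TrainingArguments` configuration along these lines. This is a sketch, not the exact training script; the `output_dir` and the explicit beta/epsilon values (which are the library defaults) are assumptions:

```python
from transformers import TrainingArguments

# Sketch of a TrainingArguments configuration matching the listed values.
# output_dir is an assumed name, not taken from the original script.
training_args = TrainingArguments(
    output_dir="deberta-v3-base-uner-down200",  # assumption
    learning_rate=2.5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch_fused",  # AdamW, fused torch implementation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
)
```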

### Training results

| Training Loss | Epoch | Step | Validation Loss | F1 | Precision | Recall | Accuracy |
|:-------------:|:-------:|:----:|:---------------:|:------:|:---------:|:------:|:--------:|
| 0.0587 | 1.5385 | 20 | 0.2010 | 0.0 | 0.0 | 0.0 | 0.9422 |
| 0.0916 | 3.0769 | 40 | 0.1425 | 0.2535 | 0.3108 | 0.2141 | 0.9535 |
| 0.0571 | 4.6154 | 60 | 0.1251 | 0.3194 | 0.3126 | 0.3265 | 0.9594 |
| 0.0289 | 6.1538 | 80 | 0.1168 | 0.4129 | 0.3915 | 0.4368 | 0.9651 |
| 0.0144 | 7.6923 | 100 | 0.1160 | 0.4527 | 0.4473 | 0.4584 | 0.9671 |
| 0.0041 | 9.2308 | 120 | 0.1040 | 0.5464 | 0.4822 | 0.6303 | 0.9713 |
| 0.0036 | 10.7692 | 140 | 0.1011 | 0.5780 | 0.5410 | 0.6205 | 0.9728 |
| 0.0026 | 12.3077 | 160 | 0.1000 | 0.6219 | 0.5760 | 0.6757 | 0.9743 |
| 0.0025 | 13.8462 | 180 | 0.1004 | 0.6285 | 0.5867 | 0.6768 | 0.9749 |
| 0.003 | 15.3846 | 200 | 0.1018 | 0.6373 | 0.5989 | 0.6811 | 0.9753 |
| 0.0032 | 16.9231 | 220 | 0.1034 | 0.6484 | 0.6117 | 0.6897 | 0.9758 |
| 0.0022 | 18.4615 | 240 | 0.1035 | 0.6529 | 0.6105 | 0.7016 | 0.9760 |
| 0.0002 | 20.0 | 260 | 0.1044 | 0.6552 | 0.6154 | 0.7005 | 0.9763 |
| 0.0021 | 21.5385 | 280 | 0.1046 | 0.6629 | 0.6240 | 0.7070 | 0.9767 |
| 0.0029 | 23.0769 | 300 | 0.1041 | 0.6643 | 0.6215 | 0.7135 | 0.9764 |
| 0.0043 | 24.6154 | 320 | 0.1046 | 0.6663 | 0.6234 | 0.7157 | 0.9765 |
| 0.0012 | 26.1538 | 340 | 0.1052 | 0.6673 | 0.6251 | 0.7157 | 0.9765 |
| 0.0014 | 27.6923 | 360 | 0.1054 | 0.6680 | 0.6255 | 0.7168 | 0.9765 |
| 0.0009 | 29.2308 | 380 | 0.1055 | 0.6677 | 0.6249 | 0.7168 | 0.9765 |
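
The evaluation cadence hints at the size of the downsampled training set: step 20 falls at epoch 1.5385, i.e. 13 optimizer steps per epoch, and at a train batch size of 16 that is exactly what a set of about 200 examples yields (12 full batches plus one partial one). This is an inference from the log, consistent with the `down200` suffix in the model name, not a documented fact:

```python
import math

# Steps per epoch implied by the log: step 20 occurs at epoch 1.5385.
steps_per_epoch = round(20 / 1.5385)
print(steps_per_epoch)  # 13

# A training set of ~200 examples at batch size 16 gives the same count:
# ceil(200 / 16) = 13 steps (12 full batches + 1 partial batch).
train_batch_size = 16
assert math.ceil(200 / train_batch_size) == steps_per_epoch
```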

### Framework versions

- Transformers 4.57.1
- PyTorch 2.8.0+cu128
- Datasets 4.3.0
- Tokenizers 0.22.1