# fine-tuned-deberta
Best checkpoint by eval accuracy from job 2463191.

3-way NLI / claim verification with labels: `supports`, `refutes`, `nei`.

Base model: `MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli`
## Processed datasets (sizes)
| Dataset | Train | Validation |
|---|---|---|
| fever | 94,616 | 11,882 |
| vitaminc | 370,653 | 63,054 |
| hover | 17,155 | 2,144 |
| averitec | 2,872 | 462 |
| climatecheck | 2,722 | 301 |
| **Total** | **488,018** | **77,843** |
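As a quick sanity check, the combined totals can be reproduced by summing the per-dataset counts from the table above:

```python
# Per-dataset example counts (train, validation) from the table above.
sizes = {
    "fever": (94_616, 11_882),
    "vitaminc": (370_653, 63_054),
    "hover": (17_155, 2_144),
    "averitec": (2_872, 462),
    "climatecheck": (2_722, 301),
}

train_total = sum(train for train, _ in sizes.values())
val_total = sum(val for _, val in sizes.values())
print(train_total, val_total)  # 488018 77843
```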
## Label distribution (combined)
| Split | supports | refutes | nei |
|---|---|---|---|
| Train | 262,189 | 165,798 | 60,031 |
| Validation | 38,207 | 29,733 | 9,903 |
| Climate eval | 139 | 45 | 117 |
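The combined training distribution is imbalanced (roughly 54% `supports`, 34% `refutes`, 12% `nei`). If one wanted to counteract this with a weighted loss, inverse-frequency class weights could be derived from the counts above; this is a common heuristic sketch, not necessarily what this training run used:

```python
# Train label counts from the combined distribution above.
counts = {"supports": 262_189, "refutes": 165_798, "nei": 60_031}
total = sum(counts.values())  # 488018

# Inverse-frequency weights, normalized so they average to 1.0.
# (A common heuristic for weighted cross-entropy; hypothetical here.)
raw = {label: total / n for label, n in counts.items()}
mean_raw = sum(raw.values()) / len(raw)
weights = {label: w / mean_raw for label, w in raw.items()}

# The minority class (nei) gets the largest weight.
print(weights)
```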
## Training configuration

| Parameter | Value |
|---|---|
| num_train_epochs | 3.0 |
| per_device_train_batch_size | 32 |
| per_device_eval_batch_size | 8 |
| gradient_accumulation_steps | 1 |
| learning_rate | 2e-06 |
| warmup_steps | 200 |
| weight_decay | 0.01 |
| lr_scheduler_type | linear |
| max_length | 320 |
| eval_steps | 2000 |
| save_steps | 2000 |
| seed | 1234 |

Hardware: 1x NVIDIA H100 (Slurm partition H100), 6 CPUs, 64 GB memory.

Last eval at step 26000 (epoch 1.7048).
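With one GPU and no gradient accumulation, the effective batch size is 32, which makes the reported step/epoch numbers easy to cross-check. Assuming the usual steps-per-epoch of `ceil(train_size / effective_batch)` with the combined train size from the dataset table:

```python
import math

train_examples = 488_018            # combined train size from the dataset table
per_device_batch = 32
grad_accum = 1
num_gpus = 1

effective_batch = per_device_batch * grad_accum * num_gpus  # 32
steps_per_epoch = math.ceil(train_examples / effective_batch)  # 15251

# The reported "last eval" step lines up with the reported epoch:
epoch_at_26000 = 26000 / steps_per_epoch
print(steps_per_epoch, round(epoch_at_26000, 4))  # 15251 1.7048
```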
## Eval metrics (step 26000)

| Metric | Value |
|---|---|
| epoch | 1.7048062422136252 |
| eval_accuracy | 0.9208278200994309 |
| eval_f1_micro | 0.9208278200994309 |
| eval_loss | 0.2498868852853775 |
| eval_macro_f1 | 0.8856652666528996 |
| eval_macro_precision | 0.8993871009204719 |
| eval_macro_recall | 0.8746118748467642 |
| eval_supports_f1 | 0.9547446769885111 |
| eval_supports_precision | 0.9461077844311377 |
| eval_supports_recall | 0.9635407124348941 |
| eval_refutes_f1 | 0.9199900091582716 |
| eval_refutes_precision | 0.9110546797704637 |
| eval_refutes_recall | 0.9291023441966838 |
| eval_nei_f1 | 0.782261113811916 |
| eval_nei_precision | 0.8409988385598142 |
| eval_nei_recall | 0.7311925679087146 |
| eval_runtime | 198.7966 |
| eval_samples_per_second | 391.571 |
| eval_steps_per_second | 48.95 |
| step | 26000 |
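Two consistency notes on these metrics: micro-F1 equals accuracy for single-label multiclass classification (hence the identical values above), and macro-F1 is the unweighted mean of the three per-class F1 scores:

```python
# Per-class F1 scores from the metrics above.
f1 = {
    "supports": 0.9547446769885111,
    "refutes": 0.9199900091582716,
    "nei": 0.782261113811916,
}

# Macro F1 = unweighted mean of per-class F1.
macro_f1 = sum(f1.values()) / len(f1)
print(macro_f1)  # ~0.8857, matching eval_macro_f1
```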