Model checkpoint for Assignment 3

The full training code, the configuration, and the training log for this checkpoint are documented in the IPython notebook included in the repository files.

Results comparable to those of the checkpoint used in Assignment 3 can be reproduced in Colab by running the training pipeline in that notebook.

This model is a fine-tuned version of microsoft/deberta-v3-base, trained on a climate claim verification dataset using the gold evidence provided with the training set. It achieves the following results on the development set:
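A minimal inference sketch follows, assuming the checkpoint can be loaded from the repository ID angela220/out (the ID shown in the model tree) and that each example is encoded as a claim paired with its evidence passages joined into one string. The pairing format, the 512-token limit, and the helper names build_pair and classify are illustrative assumptions rather than the notebook's confirmed preprocessing. Label names are read from the checkpoint's config instead of being assumed.

```python
from typing import List, Tuple

# Repository ID taken from the model tree; assumed to be loadable as-is.
MODEL_ID = "angela220/out"


def build_pair(claim: str, evidence: List[str]) -> Tuple[str, str]:
    """Pair a claim with its evidence passages joined by spaces (assumed format)."""
    return claim.strip(), " ".join(e.strip() for e in evidence)


def classify(claim: str, evidence: List[str], model_id: str = MODEL_ID) -> str:
    """Return the predicted label name for one claim/evidence pair."""
    # Imported lazily so build_pair stays usable without the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    text_a, text_b = build_pair(claim, evidence)
    # max_length=512 is an assumption; check the notebook for the value used in training.
    inputs = tokenizer(text_a, text_b, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    predicted_id = int(logits.argmax(dim=-1).item())
    # Label names come from the checkpoint config, not from this sketch.
    return model.config.id2label[predicted_id]
```

Calling classify with a claim string and a list of evidence strings returns whichever label name the checkpoint's config maps to the highest-scoring class.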

Model evaluation performance on the development set

F1: 0.7196
Accuracy: 0.7922
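The averaging used for the F1 score is not stated here. The sketch below shows how accuracy and a macro-averaged F1 can be computed from integer labels, assuming macro averaging over all classes; the notebook is the reference for the metric actually used. The function names are illustrative.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted examples contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Macro averaging weights every class equally, so it can sit well below accuracy when one class is both frequent and easy, which would be consistent with the gap between the two numbers above.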

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1.5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 0.1
num_epochs: 10
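The sketch below shows how the listed values relate: the total train batch size is the per-device batch size times the gradient accumulation steps, and the learning rate follows a linear warmup and then a cosine decay. It assumes a single device, that the warmup value of 0.1 is a fraction of total training steps, and that training runs for the 770 optimizer steps shown in the training results. The function names are illustrative, and the formula mirrors the usual cosine-with-warmup schedule rather than a confirmed excerpt of the training code.

```python
import math

PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 4
BASE_LR = 1.5e-5
TOTAL_STEPS = 770       # final step reported in the training results
WARMUP_FRACTION = 0.1   # assumption: 0.1 is a fraction of total steps


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Examples contributing to each optimizer update (single device assumed)."""
    return per_device * accum_steps * num_devices


def cosine_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS,
              warmup_fraction: float = WARMUP_FRACTION) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    warmup_steps = int(round(total_steps * warmup_fraction))
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS))
    for s in (0, 77, 385, 770):
        print(s, cosine_lr(s))
```

Under these assumptions the warmup spans the first 77 optimizer steps, which corresponds to the first epoch in the training results.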

Training results

Training Loss	Epoch	Step	Validation Loss	F1	Accuracy
5.4135	1.0	77	1.3468	0.1532	0.4416
4.6607	2.0	154	1.1471	0.3819	0.6364
4.2591	3.0	231	1.1545	0.3801	0.6234
3.9299	4.0	308	0.9857	0.6322	0.7013
3.2692	5.0	385	0.8877	0.6500	0.7273
2.7183	6.0	462	1.0321	0.6360	0.7403
2.3779	7.0	539	0.9220	0.7017	0.7727
2.1893	8.0	616	0.9742	0.7196	0.7922
1.9169	9.0	693	0.9781	0.7034	0.7857
1.8150	10.0	770	0.9680	0.7035	0.7857
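The reported development-set scores match the epoch with the highest F1 in the table above. The sketch below reproduces that lookup from the table values; how the checkpoint was actually selected is not stated here, so treating "highest dev F1" as the selection rule is an assumption. The names EpochResult and best_by_f1 are illustrative.

```python
from typing import List, NamedTuple


class EpochResult(NamedTuple):
    train_loss: float
    epoch: int
    step: int
    val_loss: float
    f1: float
    accuracy: float


# Values copied from the training results table.
RESULTS: List[EpochResult] = [
    EpochResult(5.4135, 1, 77, 1.3468, 0.1532, 0.4416),
    EpochResult(4.6607, 2, 154, 1.1471, 0.3819, 0.6364),
    EpochResult(4.2591, 3, 231, 1.1545, 0.3801, 0.6234),
    EpochResult(3.9299, 4, 308, 0.9857, 0.6322, 0.7013),
    EpochResult(3.2692, 5, 385, 0.8877, 0.6500, 0.7273),
    EpochResult(2.7183, 6, 462, 1.0321, 0.6360, 0.7403),
    EpochResult(2.3779, 7, 539, 0.9220, 0.7017, 0.7727),
    EpochResult(2.1893, 8, 616, 0.9742, 0.7196, 0.7922),
    EpochResult(1.9169, 9, 693, 0.9781, 0.7034, 0.7857),
    EpochResult(1.8150, 10, 770, 0.9680, 0.7035, 0.7857),
]


def best_by_f1(results: List[EpochResult]) -> EpochResult:
    """Return the row with the highest development-set F1."""
    return max(results, key=lambda r: r.f1)


if __name__ == "__main__":
    best = best_by_f1(RESULTS)
    print(f"epoch={best.epoch} step={best.step} f1={best.f1} accuracy={best.accuracy}")
```

Note that the lowest validation loss occurs at epoch 5, so selecting by loss instead of F1 would pick a different row.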

Framework versions

Transformers 5.8.0
Pytorch 2.10.0+cu128
Datasets 4.8.5
Tokenizers 0.22.2

Safetensors

Model size: 0.2B params
Tensor type: F32

Model tree for angela220/out

Base model: microsoft/deberta-v3-base
This model: fine-tuned from the base model