library_name: transformers
license: apache-2.0
base_model: albert/albert-xlarge-v2
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: 88baaa0fc0d014bbfbf8077c9420beea
    results: []

88baaa0fc0d014bbfbf8077c9420beea

This model is a fine-tuned version of albert/albert-xlarge-v2 on the abstract configuration of the ccdv/patent-classification dataset. It achieves the following results on the evaluation set (values from the last logged epoch, epoch 11):

Loss: 1.9926
Data Size: 1.0
Epoch Runtime: 145.0872
Accuracy: 0.2218
F1 Macro: 0.0403
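
Accuracy and macro F1 are easier to interpret together. The following is a minimal sketch, assuming the task has 9 classes (an assumption about ccdv/patent-classification, not something this card states). The hypothetical helper `constant_predictor_macro_f1` computes the macro F1 of a classifier that always predicts one class whose share of the evaluation set is `p`. The reported pair (accuracy 0.2218, F1 macro 0.0403) is consistent with that pattern, which suggests, but does not prove, that the model predicts a single class for most inputs.

```python
def constant_predictor_macro_f1(p: float, num_classes: int) -> float:
    """Macro F1 of a classifier that always predicts the same class.

    p is the share of evaluation examples in the predicted class,
    which is also the accuracy of such a classifier.
    """
    # For the predicted class: precision = p and recall = 1,
    # so F1 = 2 * p / (1 + p).
    f1_predicted_class = 2 * p / (1 + p)
    # Every other class is never predicted, so its F1 counts as 0.
    return f1_predicted_class / num_classes


NUM_CLASSES = 9  # assumption: not stated in this card

reported_accuracy = 0.2218  # from the evaluation results above
reported_f1_macro = 0.0403  # from the evaluation results above

expected_f1_macro = constant_predictor_macro_f1(reported_accuracy, NUM_CLASSES)
gap = abs(expected_f1_macro - reported_f1_macro)
print(f"expected {expected_f1_macro:.4f}, reported {reported_f1_macro:.4f}, gap {gap:.4f}")
```

A per-class confusion matrix on the evaluation set would be the direct way to confirm or rule this out.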

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
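
The hyperparameters above can be sketched as keyword arguments named after `transformers.TrainingArguments` parameters. This is an illustration and not the original training script: `output_dir` is a hypothetical path, `train_batch_size` is assumed to be the per-device value, and no gradient accumulation is assumed because none is listed. Under those assumptions the total batch sizes follow from the device count.

```python
# Keyword arguments named after transformers.TrainingArguments parameters.
training_kwargs = {
    "output_dir": "albert-xlarge-v2-patent-abstract",  # hypothetical path
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,  # assumption: card value is per device
    "per_device_eval_batch_size": 8,   # assumption: card value is per device
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

NUM_DEVICES = 4  # from the card: distributed_type multi-GPU, num_devices 4

# Total batch size = per-device batch size * number of devices
# (assuming no gradient accumulation).
total_train_batch_size = training_kwargs["per_device_train_batch_size"] * NUM_DEVICES
total_eval_batch_size = training_kwargs["per_device_eval_batch_size"] * NUM_DEVICES

print(total_train_batch_size, total_eval_batch_size)
```

Both totals come out to 32, matching total_train_batch_size and total_eval_batch_size in the list. With transformers installed, the dictionary could be passed as `TrainingArguments(**training_kwargs)`.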

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Accuracy	F1 Macro
No log	0	0	2.4026	0	9.6523	0.1004	0.0203
No log	1	781	2.0320	0.0078	11.3637	0.1538	0.0372
No log	2	1562	2.0092	0.0156	11.9116	0.1510	0.0292
No log	3	2343	2.0049	0.0312	14.1118	0.2071	0.0381
0.0454	4	3124	2.0001	0.0625	18.2314	0.2218	0.0403
2.0154	5	3905	1.9943	0.125	26.8018	0.2218	0.0403
2.0269	6	4686	1.9923	0.25	43.9274	0.2071	0.0381
1.9998	7	5467	1.9855	0.5	77.6442	0.2071	0.0381
1.9763	8	6248	1.9884	1.0	145.9947	0.2161	0.0648
1.972	9	7029	1.9885	1.0	145.3029	0.2071	0.0381
1.985	10	7810	1.9864	1.0	145.0537	0.2218	0.0403
1.9704	11	8591	1.9926	1.0	145.0872	0.2218	0.0403
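
The headline results come from the last logged epoch, which is not necessarily the best one. As a small sketch, the snippet below copies the epoch, step, validation loss, accuracy, and macro F1 columns from the table and picks the epochs with the lowest validation loss and the highest macro F1. The `EvalRow` type and the variable names are illustrative and not part of the training code.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    accuracy: float
    f1_macro: float


# Values copied from the training results table.
rows = [
    EvalRow(0, 0, 2.4026, 0.1004, 0.0203),
    EvalRow(1, 781, 2.0320, 0.1538, 0.0372),
    EvalRow(2, 1562, 2.0092, 0.1510, 0.0292),
    EvalRow(3, 2343, 2.0049, 0.2071, 0.0381),
    EvalRow(4, 3124, 2.0001, 0.2218, 0.0403),
    EvalRow(5, 3905, 1.9943, 0.2218, 0.0403),
    EvalRow(6, 4686, 1.9923, 0.2071, 0.0381),
    EvalRow(7, 5467, 1.9855, 0.2071, 0.0381),
    EvalRow(8, 6248, 1.9884, 0.2161, 0.0648),
    EvalRow(9, 7029, 1.9885, 0.2071, 0.0381),
    EvalRow(10, 7810, 1.9864, 0.2218, 0.0403),
    EvalRow(11, 8591, 1.9926, 0.2218, 0.0403),
]

# Epoch with the lowest validation loss and epoch with the highest macro F1.
best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_f1 = max(rows, key=lambda r: r.f1_macro)

# Optimizer steps between consecutive evaluations.
steps_per_epoch = {b.step - a.step for a, b in zip(rows, rows[1:])}

print("lowest validation loss:", best_by_loss)
print("highest macro F1:", best_by_f1)
print("steps between evaluations:", steps_per_epoch)
```

On these numbers the lowest validation loss falls at epoch 7 and the highest macro F1 at epoch 8, while every evaluation is 781 steps apart. The differences between epochs 4 and 11 are small, so choosing a checkpoint on either criterion changes little.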

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.3.0
Tokenizers 0.22.1