BARTxiv

See the model implementation here.

This model is a fine-tuned version of facebook/bart-large-cnn on the arxiv-summarization dataset. Its validation-set loss and ROUGE scores for each training epoch are listed in the results table below.
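A minimal sketch of how a BART summarization checkpoint like this one might be loaded and run with the transformers library. The repository ID of the fine-tuned model is not given in this card, so `MODEL_ID` below is a stand-in that points at the base model, facebook/bart-large-cnn; swap in the actual checkpoint ID. The function name `summarize`, the beam count, and the length limits are illustrative assumptions, not the settings used to produce the scores reported here.

```python
# Stand-in ID: this card does not state the fine-tuned checkpoint's repo name,
# so the base model is used here. Replace it with the real checkpoint ID.
MODEL_ID = "facebook/bart-large-cnn"


def summarize(text, model_id=MODEL_ID, max_new_tokens=256):
    """Return a generated summary of `text` (hypothetical helper)."""
    # Imported lazily so that defining this function does not require
    # transformers/torch to be installed or any weights to be downloaded.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # BART accepts at most 1024 input tokens; longer papers are truncated.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)

    # Beam search with an assumed beam width of 4.
    output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(paper_text)` downloads the weights on first use. Because the input is truncated to 1024 tokens, the summary reflects only the opening portion of a long paper.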

Model description

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum
1.24	1.0	1073	1.24	38.32	12.80	20.55	34.50
1.04	2.0	2146	1.04	39.65	13.74	21.28	35.83
0.979	3.0	3219	0.98	40.19	14.30	21.87	36.38
0.970	4.0	4292	0.97	40.87	14.44	22.14	36.89
0.918	5.0	5365	0.92	41.17	14.94	22.54	37.40
0.901	6.0	6438	0.90	41.02	14.65	22.46	37.05
0.889	7.0	7511	0.89	41.32	15.09	22.64	37.42
0.900	8.0	8584	0.90	41.23	15.02	22.67	37.28
0.869	9.0	9657	0.87	41.70	15.13	22.85	37.77
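
To make the Rouge1, Rouge2, and Rougel columns easier to read, here is a simplified sketch of what those scores measure: F1 over shared unigrams, F1 over shared bigrams, and F1 over the longest common subsequence. It assumes lowercase whitespace tokenization with no stemming, so it will not reproduce the numbers in the table, which presumably come from a standard ROUGE implementation and are scaled by 100. Rougelsum, which applies the subsequence measure sentence by sentence, is not covered. All function names are illustrative.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, n_cand, n_ref):
    """F1 from an overlap count and the candidate/reference totals."""
    if overlap == 0 or n_cand == 0 or n_ref == 0:
        return 0.0
    precision = overlap / n_cand
    recall = overlap / n_ref
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N F1 (n=1 for Rouge1, n=2 for Rouge2)."""
    cand = _ngrams(candidate.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def _lcs_len(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Simplified ROUGE-L F1 based on the longest common subsequence."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return _f1(_lcs_len(cand, ref), len(cand), len(ref))
```

For example, `rouge_n("the model summarizes papers", "the model summarizes long papers", n=2)` counts two shared bigrams out of three candidate and four reference bigrams, giving precision 2/3, recall 1/2, and F1 of 4/7.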

Safetensors

Model size: 0.4B params

Tensor type: F32
