contemmcm/d4642015f9fa8db06d31232a6745c19f

Text Classification

Generated from Trainer

text-embeddings-inference


	---
	library_name: transformers
	license: apache-2.0
	base_model: distilbert/distilroberta-base
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	- rouge
	model-index:
	- name: d4642015f9fa8db06d31232a6745c19f
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# d4642015f9fa8db06d31232a6745c19f

	This model is a fine-tuned version of [distilbert/distilroberta-base](https://huggingface.co/distilbert/distilroberta-base) on the google/boolq dataset.
	It achieves the following results on the evaluation set (these match the last logged epoch, epoch 12 at step 3528, in the training results table):
	- Loss: 0.9388
	- Data Size: 1.0
	- Epoch Runtime: 13.8599
	- Accuracy: 0.7215
	- F1 Macro: 0.7022
	- Rouge1: 0.7218
	- Rouge2: 0.0
	- RougeL: 0.7209
	- RougeLsum: 0.7209
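Accuracy and macro F1 can diverge sharply on an imbalanced binary task. The sketch below uses plain Python and synthetic labels (not actual BoolQ predictions) to compute both. The 62.13% / 37.87% split is an assumption inferred from the training results table, where epochs 2, 4, and 5 pair an accuracy of 0.6213 with a macro F1 of 0.3832. That pattern is consistent with a model predicting a single class for every example, though the card does not state this.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_macro(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never present scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Synthetic labels (assumption): 62.13% positive, and a classifier
# that answers "positive" for every example.
y_true = [1] * 6213 + [0] * 3787
y_pred = [1] * 10000

majority_acc = accuracy(y_true, y_pred)
majority_f1 = f1_macro(y_true, y_pred)
print(round(majority_acc, 4), round(majority_f1, 4))
```

The same arithmetic with the classes swapped gives roughly 0.3787 accuracy and 0.2747 macro F1, which lines up with the epoch 0 row of the table.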

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
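A minimal inference sketch, under several assumptions the card does not confirm: that the checkpoint loads as a standard sequence-classification model through the `transformers` `text-classification` pipeline, that a question and its passage are passed as a text pair in that order, and that the output labels use whatever names were saved in the model config. The helper names `build_input` and `classify` are hypothetical.

```python
MODEL_ID = "contemmcm/d4642015f9fa8db06d31232a6745c19f"


def build_input(question, passage):
    """Pack a BoolQ-style example as a text pair (assumed input format)."""
    return {"text": question, "text_pair": passage}


def classify(question, passage):
    """Run the fine-tuned checkpoint on one question/passage pair."""
    # Imported lazily so build_input works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=MODEL_ID)
    return classifier(build_input(question, passage))
```

Calling `classify(...)` downloads the model from the Hub on first use. How the returned label maps to a yes/no answer should be checked against the model's `id2label` config before relying on it.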

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 4
	- total_train_batch_size: 32
	- total_eval_batch_size: 32
	- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
	- lr_scheduler_type: constant
	- num_epochs: 50
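The total batch sizes in the list follow from the per-device batch size and the device count. A small sketch of that arithmetic, assuming no gradient accumulation (the card does not list any; `grad_accum_steps` is a hypothetical parameter added for illustration):

```python
def total_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Examples consumed per optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


# Values taken from the hyperparameter list above.
train_total = total_batch_size(per_device=8, num_devices=4)
eval_total = total_batch_size(per_device=8, num_devices=4)
print(train_total, eval_total)
```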

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Accuracy | F1 Macro | Rouge1 | Rouge2 | RougeL | RougeLsum |
	|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:--------:|:--------:|:------:|:------:|:------:|:---------:|
	| No log | 0 | 0 | 0.7378 | 0 | 2.0385 | 0.3787 | 0.2747 | 0.3787 | 0.0 | 0.3793 | 0.3790 |
	| No log | 1 | 294 | 0.6905 | 0.0078 | 2.6280 | 0.5643 | 0.5106 | 0.5650 | 0.0 | 0.5646 | 0.5646 |
	| No log | 2 | 588 | 0.6615 | 0.0156 | 2.5438 | 0.6213 | 0.3832 | 0.6213 | 0.0 | 0.6207 | 0.6210 |
	| No log | 3 | 882 | 0.6658 | 0.0312 | 2.8860 | 0.6198 | 0.3857 | 0.6201 | 0.0 | 0.6195 | 0.6198 |
	| 0.027 | 4 | 1176 | 0.6602 | 0.0625 | 3.4238 | 0.6213 | 0.3832 | 0.6213 | 0.0 | 0.6207 | 0.6210 |
	| 0.0552 | 5 | 1470 | 0.6547 | 0.125 | 4.1016 | 0.6213 | 0.3832 | 0.6213 | 0.0 | 0.6207 | 0.6210 |
	| 0.0935 | 6 | 1764 | 0.6467 | 0.25 | 5.5563 | 0.6471 | 0.5745 | 0.6471 | 0.0 | 0.6468 | 0.6474 |
	| 0.6078 | 7 | 2058 | 0.6134 | 0.5 | 8.3762 | 0.6838 | 0.6222 | 0.6841 | 0.0 | 0.6829 | 0.6841 |
	| 0.5437 | 8 | 2352 | 0.5814 | 1.0 | 14.4086 | 0.6952 | 0.6273 | 0.6955 | 0.0 | 0.6949 | 0.6953 |
	| 0.4424 | 9 | 2646 | 0.6237 | 1.0 | 13.7459 | 0.7255 | 0.6899 | 0.7255 | 0.0 | 0.7255 | 0.7258 |
	| 0.293 | 10 | 2940 | 0.7580 | 1.0 | 13.7696 | 0.7117 | 0.7017 | 0.7119 | 0.0 | 0.7114 | 0.7111 |
	| 0.1986 | 11 | 3234 | 0.8660 | 1.0 | 13.5540 | 0.7390 | 0.7105 | 0.7393 | 0.0 | 0.7390 | 0.7384 |
	| 0.162 | 12 | 3528 | 0.9388 | 1.0 | 13.8599 | 0.7215 | 0.7022 | 0.7218 | 0.0 | 0.7209 | 0.7209 |
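In the table, validation loss reaches its minimum several epochs before the last logged row, while accuracy peaks one epoch before the end. The sketch below copies the relevant columns from the table and picks out those rows. It also checks one reading of the Data Size column: that it is the fraction of the training data used, doubling each epoch until it reaches 1.0 at epoch 8. That reading is inferred from the numbers and is not stated by the card.

```python
# (epoch, validation_loss, data_size, accuracy, f1_macro), copied from the table.
ROWS = [
    (0, 0.7378, 0.0, 0.3787, 0.2747),
    (1, 0.6905, 0.0078, 0.5643, 0.5106),
    (2, 0.6615, 0.0156, 0.6213, 0.3832),
    (3, 0.6658, 0.0312, 0.6198, 0.3857),
    (4, 0.6602, 0.0625, 0.6213, 0.3832),
    (5, 0.6547, 0.125, 0.6213, 0.3832),
    (6, 0.6467, 0.25, 0.6471, 0.5745),
    (7, 0.6134, 0.5, 0.6838, 0.6222),
    (8, 0.5814, 1.0, 0.6952, 0.6273),
    (9, 0.6237, 1.0, 0.7255, 0.6899),
    (10, 0.7580, 1.0, 0.7117, 0.7017),
    (11, 0.8660, 1.0, 0.7390, 0.7105),
    (12, 0.9388, 1.0, 0.7215, 0.7022),
]

# Row with the lowest validation loss and row with the highest accuracy.
best_loss_row = min(ROWS, key=lambda row: row[1])
best_acc_row = max(ROWS, key=lambda row: row[3])

# Assumed schedule: data fraction 2**(epoch - 8) for epochs 1..8.
doubling_holds = all(
    abs(data_size - 2.0 ** (epoch - 8)) < 1e-4
    for epoch, _, data_size, _, _ in ROWS
    if 1 <= epoch <= 8
)

print("lowest validation loss at epoch", best_loss_row[0])
print("highest accuracy at epoch", best_acc_row[0])
print("data size doubles per epoch:", doubling_holds)
```

Validation loss rising from epoch 9 onward while training loss keeps falling is a common sign of overfitting, so an earlier checkpoint may be worth comparing if one was saved.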


	### Framework versions

	- Transformers 4.57.0
	- PyTorch 2.8.0+cu128
	- Datasets 4.3.0
	- Tokenizers 0.22.1