---
tags:
- masked-auto-encoding
- generated_from_trainer
model-index:
- name: pixel-barec-pretrain
  results: []
---
|
|
|
|
|
|
|
|
|
|
# pixel-barec-pretrain |
|
|
|
|
|
This model is a fine-tuned version of [bensapir/pixel-barec-pretrain](https://huggingface.co/bensapir/pixel-barec-pretrain) on the wikipedia and bookcorpus datasets.
It achieves the following results on the evaluation set:
- Loss: 0.6179
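The `masked-auto-encoding` tag indicates a ViT-MAE-style objective, in which patches of rendered text images are masked and reconstructed. Below is a minimal loading sketch under the assumption that the checkpoint is compatible with the stock `ViTMAEForPreTraining` class; PIXEL-style models typically ship with their own model and text-rendering code, so treat this as illustrative rather than the project's actual API.

```python
# Hypothetical usage sketch: assumes the checkpoint loads with the generic
# ViT-MAE classes from transformers. If it does not, the custom classes from
# the upstream PIXEL codebase are likely required instead.
import torch
from transformers import ViTMAEForPreTraining

model = ViTMAEForPreTraining.from_pretrained("bensapir/pixel-barec-pretrain")
model.eval()

# Dummy batch of rendered-text "images"; 3 x 224 x 224 is the ViT-MAE default
# input resolution and is an assumption here (PIXEL uses its own rendering).
pixel_values = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    outputs = model(pixel_values=pixel_values)

print(outputs.loss)  # masked-patch reconstruction loss on the dummy batch
```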
|
|
|
|
|
## Model description |
|
|
|
|
|
More information needed |
|
|
|
|
|
## Intended uses & limitations |
|
|
|
|
|
More information needed |
|
|
|
|
|
## Training and evaluation data |
|
|
|
|
|
More information needed |
|
|
|
|
|
## Training procedure |
|
|
|
|
|
### Training hyperparameters |
|
|
|
|
|
The following hyperparameters were used during training:
- learning_rate: 9.375e-06
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.5
- training_steps: 200000
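For reference, here is a sketch of how these settings map onto `transformers.TrainingArguments`; the output directory and any flags beyond those listed above are assumptions, since the actual launch script is not part of this card.

```python
from transformers import TrainingArguments

# Approximate reconstruction of the listed hyperparameters; the real training
# script (and any additional flags it passed) is not recorded in this card.
training_args = TrainingArguments(
    output_dir="pixel-barec-pretrain",  # assumed
    learning_rate=9.375e-06,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.5,   # half of the 200k steps are warmup
    max_steps=200_000,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
)
```

Note that `warmup_ratio=0.5` over 200000 steps implies roughly 100000 warmup steps, which is consistent with the training results below: validation loss hovers near 0.74 until step 100000 and only then begins to fall steadily.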
|
|
|
|
|
### Training results |
|
|
|
|
|
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:------:|:---------------:|
| 0.8164 | 11.19 | 10000 | 0.7569 |
| 0.7702 | 22.37 | 20000 | 0.7498 |
| 0.7668 | 33.56 | 30000 | 0.7477 |
| 0.7655 | 44.74 | 40000 | 0.7451 |
| 0.7653 | 27.98 | 50000 | 0.7479 |
| 0.7648 | 33.58 | 60000 | 0.7448 |
| 0.7645 | 39.17 | 70000 | 0.7464 |
| 0.7642 | 44.77 | 80000 | 0.7450 |
| 0.7636 | 50.36 | 90000 | 0.7427 |
| 0.7602 | 55.96 | 100000 | 0.7262 |
| 0.7279 | 61.56 | 110000 | 0.6972 |
| 0.6981 | 67.15 | 120000 | 0.6809 |
| 0.6781 | 72.75 | 130000 | 0.6643 |
| 0.6612 | 78.34 | 140000 | 0.6534 |
| 0.6483 | 83.94 | 150000 | 0.6426 |
| 0.6389 | 89.54 | 160000 | 0.6357 |
| 0.6318 | 95.13 | 170000 | 0.6320 |
| 0.6261 | 100.73 | 180000 | 0.6280 |
| 0.6214 | 106.32 | 190000 | 0.6200 |
| 0.6177 | 111.92 | 200000 | 0.6200 |
|
|
|
|
|
|
|
|
### Framework versions |
|
|
|
|
|
- Transformers 4.17.0
- Pytorch 2.5.1
- Datasets 2.1.1.dev0
- Tokenizers 0.21.1
|
|
|