---
license: mit
base_model: Toflamus/GPT-2_para3M
tags:
- generated_from_trainer
model-index:
- name: Output
  results: []
---
|
|
|
|
|
|
|
|
|
|
# Output |
|
|
|
|
|
This model is a fine-tuned version of [Toflamus/GPT-2_para3M](https://huggingface.co/Toflamus/GPT-2_para3M) on an unknown dataset. |
|
|
Final training summary (from the returned `TrainOutput`):

- global steps: 4,060 (5 epochs)
- final training loss: 6.1231
- train runtime: 1,435.05 s (181.19 samples/s, 2.83 steps/s)
- total FLOs: 9.667e13
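The card does not state where the fine-tuned checkpoint is published, so the sketch below loads the base model `Toflamus/GPT-2_para3M` as a stand-in; substitute the fine-tuned checkpoint's path or hub id once it is available.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The fine-tuned checkpoint id is not given in this card; the base model
# Toflamus/GPT-2_para3M stands in here. Swap in the fine-tuned weights
# (e.g. a local output directory) to sample from the trained model.
model_id = "Toflamus/GPT-2_para3M"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```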
|
|
## Model description |
|
|
|
|
|
More information needed |
|
|
|
|
|
## Intended uses & limitations |
|
|
|
|
|
More information needed |
|
|
|
|
|
## Training and evaluation data |
|
|
|
|
|
More information needed |
|
|
|
|
|
## Training procedure |
|
|
|
|
|
### Training hyperparameters |
|
|
|
|
|
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 5
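As a sanity check on the schedule shape, here is a small standalone sketch of linear warmup followed by cosine decay with the numbers above (100 warmup steps, 4,060 total optimizer steps, peak 2e-05). It mirrors the behavior of transformers' `get_cosine_schedule_with_warmup`, not its exact code:

```python
import math

PEAK_LR = 2e-05      # learning_rate
WARMUP = 100         # lr_scheduler_warmup_steps
TOTAL_STEPS = 4060   # global steps after 5 epochs

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay toward 0."""
    if step < WARMUP:
        return PEAK_LR * step / WARMUP
    progress = (step - WARMUP) / (TOTAL_STEPS - WARMUP)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

# Effective batch size: train_batch_size 8 x gradient_accumulation_steps 8
# = 64, matching total_train_batch_size above.
```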
|
|
|
|
|
### Training results |
|
|
| Step | Training Loss |
|:----:|:-------------:|
| 100  | 7.7379 |
| 200  | 7.0667 |
| 300  | 6.8402 |
| 400  | 6.6866 |
| 500  | 6.6077 |
| 600  | 6.5165 |
| 700  | 6.4498 |
| 800  | 6.3604 |
| 900  | 6.3217 |
| 1000 | 6.2527 |
| 1100 | 6.2235 |
| 1200 | 6.1947 |
| 1300 | 6.1315 |
| 1400 | 6.1134 |
| 1500 | 6.1065 |
| 1600 | 6.0441 |
| 1700 | 6.0244 |
| 1800 | 6.0085 |
| 1900 | 6.0066 |
| 2000 | 5.9599 |
| 2100 | 5.9311 |
| 2200 | 5.9253 |
| 2300 | 5.9335 |
| 2400 | 5.9219 |
| 2500 | 5.9134 |
| 2600 | 5.8981 |
| 2700 | 5.8747 |
| 2800 | 5.8691 |
| 2900 | 5.8512 |
| 3000 | 5.8539 |
| 3100 | 5.8701 |
| 3200 | 5.8681 |
| 3300 | 5.8370 |
| 3400 | 5.8453 |
| 3500 | 5.8288 |
| 3600 | 5.8474 |
| 3700 | 5.8586 |
| 3800 | 5.8532 |
| 3900 | 5.8366 |
| 4000 | 5.8491 |
|
|
|
|
|
|
|
|
### Framework versions |
|
|
|
|
|
- Transformers 4.32.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.2
|
|
|