---
license: mit
tags:
- generated_from_trainer
model-index:
- name: FolkGPT
  results: []
datasets:
- vicclab/fairy_tales
language:
- en
pipeline_tag: text-generation
---

# FolkGPT

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the vicclab/fairy_tales dataset.

## Model description

This model is the result of fine-tuning gpt2 on a dataset of fairy tales from various cultures.

## Intended uses & limitations

The model is intended to generate text in the style of fairy tales written in the 18th and 19th centuries.

Why fairy tales? They seemed an appropriate application for text generation, as the stories are usually short(ish), self-contained, and easy to read.
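
A minimal usage sketch with the `transformers` pipeline API; note that the `vicclab/FolkGPT` repository id is an assumption inferred from the dataset namespace, not something stated in this card:

```python
from transformers import pipeline

# Model id is an assumption; substitute the actual Hub repository if it differs.
generator = pipeline("text-generation", model="vicclab/FolkGPT")

prompt = "Once upon a time, in a forest at the edge of the kingdom,"
result = generator(prompt, max_new_tokens=200, do_sample=True, top_p=0.95)
print(result[0]["generated_text"])
```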

## Training and evaluation data

The model was trained on the vicclab/fairy_tales dataset. The dataset consists of a number of texts that were downloaded from Project Gutenberg and then edited to remove all text except the stories themselves. These were then concatenated into a single text file and pushed to the Hugging Face Hub at https://huggingface.co/datasets/vicclab/fairy_tales. The latest update to the dataset, which was used to train this model, was created and uploaded on February 26th, 2023.

Texts used [with token counts after removing boilerplate text]:

- https://www.gutenberg.org/files/2591/2591-0.txt [102927 tokens]
- https://www.gutenberg.org/files/503/503-0.txt [138353 tokens]
- https://www.gutenberg.org/cache/epub/69739/pg69739.txt [51035 tokens]
- https://www.gutenberg.org/files/2435/2435-0.txt [98791 tokens]
- https://www.gutenberg.org/cache/epub/7871/pg7871.txt [49410 tokens]
- https://www.gutenberg.org/files/8933/8933-0.txt [178622 tokens]
- https://www.gutenberg.org/cache/epub/30834/pg30834.txt [58359 tokens]
- https://www.gutenberg.org/cache/epub/68589/pg68589.txt [39815 tokens]
- https://www.gutenberg.org/cache/epub/34453/pg34453.txt [69365 tokens]
- https://www.gutenberg.org/cache/epub/8653/pg8653.txt [35351 tokens]

[Total tokens in the final dataset: 1002654 tokens]
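
The dataset can be loaded directly with the `datasets` library. A minimal sketch (the split and column names assume the Hub's default handling of a plain text file):

```python
from datasets import load_dataset

# A single concatenated text file is exposed as a "train" split with a "text"
# column (assumed default layout; adjust if the repo defines its own config).
ds = load_dataset("vicclab/fairy_tales")
print(ds["train"][0]["text"][:200])
```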

## Training procedure

The dataset was loaded, sampling by paragraph. It was then split 80/20 into a training set and a validation set, and both splits were tokenized. The model was set up, the trainer was instantiated with the training arguments listed below, and training was run.
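
A hedged sketch of the split-and-tokenize step, assuming "sampling by paragraph" means one example per blank-line-separated block; the file name, `max_length`, and variable names are illustrative assumptions:

```python
from datasets import Dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token

# Sample by paragraph: one example per blank-line-separated block (assumed heuristic).
with open("fairy_tales.txt", encoding="utf-8") as f:
    paragraphs = [p.strip() for p in f.read().split("\n\n") if p.strip()]

ds = Dataset.from_dict({"text": paragraphs})
splits = ds.train_test_split(test_size=0.2, seed=42)  # the 80/20 split

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized_train = splits["train"].map(tokenize, batched=True, remove_columns=["text"])
tokenized_val = splits["test"].map(tokenize, batched=True, remove_columns=["text"])
```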

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 1
- mixed_precision_training: Native AMP
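
For reference, a sketch of an equivalent `TrainingArguments`/`Trainer` setup. This is a reconstruction from the list above, not the original training script; `output_dir` and the dataset variables are assumptions:

```python
from transformers import (AutoModelForCausalLM, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model = AutoModelForCausalLM.from_pretrained("gpt2")

args = TrainingArguments(
    output_dir="folkgpt",           # assumed; any path works
    learning_rate=5e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size of 4
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    num_train_epochs=1,
    fp16=True,                      # Native AMP mixed precision
    # Adam betas (0.9, 0.999) and epsilon 1e-8 are the Trainer defaults.
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,  # tokenized splits from the sketch above
    eval_dataset=tokenized_val,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```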

### Training results



### Framework versions

- Transformers 4.26.1
- Pytorch 1.13.1+cu116
- Datasets 2.10.0
- Tokenizers 0.13.2