basho_haiku_gpt2

Generate a Haiku in the style of Matsuo Bashō given an initial word prompt!
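A minimal usage sketch with the `transformers` text-generation pipeline follows. It assumes the model is published under the repository id `JulianS/basho_haiku_gpt2` that appears on this page. The helper name `generate_haiku` and the sampling settings (`max_new_tokens`, `top_p`, `temperature`) are illustrative choices, not values documented by the author.

```python
def generate_haiku(prompt, model_id="JulianS/basho_haiku_gpt2",
                   max_new_tokens=30, seed=None):
    """Continue a one-word prompt into a haiku-like text.

    Hypothetical helper: the sampling settings below are assumptions,
    not the author's documented configuration.
    """
    # Imported lazily so that defining the helper does not require
    # transformers to be installed.
    from transformers import pipeline, set_seed

    if seed is not None:
        set_seed(seed)  # makes sampling reproducible

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,      # sample instead of greedy decoding
        top_p=0.95,          # assumed nucleus-sampling threshold
        temperature=0.9,     # assumed softmax temperature
    )
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]


# Example call (downloads the model weights on first use):
# print(generate_haiku("Autumn", seed=0))
```

Sampling is switched on because greedy decoding would return the same text for a given prompt every time; lower the temperature for more conservative output.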

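This model is a fine-tuned version of gpt2 on a Haiku dataset. On the evaluation set, validation loss reaches its minimum of 0.5807 at step 900 (epoch 1.0) and stands at 0.7002 at step 4400, the last logged step; the full log is in the table below.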

Training and validation loss, logged every 100 steps:

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.9095 | 0.11 | 100 | 0.6605 |
| 0.6638 | 0.22 | 200 | 0.6303 |
| 0.6456 | 0.33 | 300 | 0.6163 |
| 0.6421 | 0.45 | 400 | 0.6139 |
| 0.6421 | 0.56 | 500 | 0.6069 |
| 0.6182 | 0.67 | 600 | 0.5944 |
| 0.6277 | 0.78 | 700 | 0.5910 |
| 0.6409 | 0.89 | 800 | 0.5878 |
| 0.6047 | 1.0 | 900 | 0.5807 |
| 0.4944 | 1.11 | 1000 | 0.5847 |
| 0.4878 | 1.23 | 1100 | 0.5897 |
| 0.4706 | 1.34 | 1200 | 0.5947 |
| 0.4829 | 1.45 | 1300 | 0.5866 |
| 0.4742 | 1.56 | 1400 | 0.5875 |
| 0.4555 | 1.67 | 1500 | 0.5884 |
| 0.4713 | 1.78 | 1600 | 0.5890 |
| 0.4669 | 1.9 | 1700 | 0.5848 |
| 0.475 | 2.01 | 1800 | 0.5838 |
| 0.3762 | 2.12 | 1900 | 0.6123 |
| 0.3703 | 2.23 | 2000 | 0.6172 |
| 0.3772 | 2.34 | 2100 | 0.6118 |
| 0.3731 | 2.45 | 2200 | 0.6090 |
| 0.3662 | 2.56 | 2300 | 0.6151 |
| 0.3894 | 2.68 | 2400 | 0.6132 |
| 0.3663 | 2.79 | 2500 | 0.6195 |
| 0.368 | 2.9 | 2600 | 0.6163 |
| 0.3735 | 3.01 | 2700 | 0.6191 |
| 0.3006 | 3.12 | 2800 | 0.6518 |
| 0.3071 | 3.23 | 2900 | 0.6603 |
| 0.2898 | 3.34 | 3000 | 0.6629 |
| 0.2986 | 3.46 | 3100 | 0.6648 |
| 0.3107 | 3.57 | 3200 | 0.6558 |
| 0.3064 | 3.68 | 3300 | 0.6568 |
| 0.3052 | 3.79 | 3400 | 0.6633 |
| 0.3069 | 3.9 | 3500 | 0.6626 |
| 0.2872 | 4.01 | 3600 | 0.6641 |
| 0.2711 | 4.12 | 3700 | 0.6848 |
| 0.2584 | 4.24 | 3800 | 0.6944 |
| 0.2606 | 4.35 | 3900 | 0.7007 |
| 0.2538 | 4.46 | 4000 | 0.7029 |
| 0.2481 | 4.57 | 4100 | 0.7014 |
| 0.2466 | 4.68 | 4200 | 0.7006 |
| 0.25 | 4.79 | 4300 | 0.6990 |
| 0.2568 | 4.91 | 4400 | 0.7002 |
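In the log above, training loss keeps falling while validation loss bottoms out early and then climbs, which is the usual signature of overfitting. The sketch below picks the logged step with the lowest validation loss from the table's last column. The function name `best_checkpoint` is illustrative, and the card does not say whether a checkpoint was actually saved at that step.

```python
# Validation losses copied from the table above, one entry per 100 steps
# (steps 100, 200, ..., 4400).
VAL_LOSS = [
    0.6605, 0.6303, 0.6163, 0.6139, 0.6069, 0.5944, 0.5910, 0.5878, 0.5807,
    0.5847, 0.5897, 0.5947, 0.5866, 0.5875, 0.5884, 0.5890, 0.5848, 0.5838,
    0.6123, 0.6172, 0.6118, 0.6090, 0.6151, 0.6132, 0.6195, 0.6163, 0.6191,
    0.6518, 0.6603, 0.6629, 0.6648, 0.6558, 0.6568, 0.6633, 0.6626, 0.6641,
    0.6848, 0.6944, 0.7007, 0.7029, 0.7014, 0.7006, 0.6990, 0.7002,
]
LOG_EVERY = 100  # evaluation interval in optimizer steps


def best_checkpoint(val_losses, log_every=LOG_EVERY):
    """Return (step, loss) for the lowest validation loss in the log."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return (best_index + 1) * log_every, val_losses[best_index]


best_step, best_loss = best_checkpoint(VAL_LOSS)
final_loss = VAL_LOSS[-1]

print(f"lowest validation loss {best_loss:.4f} at step {best_step}")
print(f"increase by the last logged step: {final_loss - best_loss:.4f}")
# prints:
# lowest validation loss 0.5807 at step 900
# increase by the last logged step: 0.1195
```

By this measure the end of the first epoch generalizes best; the later epochs mostly fit the training set more closely.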

Model repository: JulianS/basho_haiku_gpt2 (fine-tuned from the gpt2 base model). Weights are stored as Safetensors: 0.1B parameters, F32 tensors.