Update README.md


README.md

@@ -10,4 +10,4 @@ Impressive performance for its size, however due to the small size, the model is
 ### Things to note
-This model was trained at 12 epoch (which I thought 6 sufficient but I guess more is better for models < 3B) at 1e-4 learning rate with batch size of 8. One insight I found is that the model (raw safetensors, not quantised), performs very well at 12 epochs albeit having some flaws due to dataset limitation and model capacity, so Im going to stick to this settings in future training
+This model was trained at 12 epoch (which I thought 6 was sufficient but I guess more is better for models < 3B) at 1e-4 learning rate with batch size of 8. One insight I found is that the model (raw safetensors, not quantised) performs very well at 12 epochs albeit having some flaws due to dataset limitation and model capacity leading to saturated quality and output, so Im going to stick to this settings in future training