moelanoby/Kok-GPT

@@ -5,4 +5,10 @@ library_name: transformers
 this is the generative pretrained model (GPT) version of the model :D
-it's a base model too, so you should fine-tune it yourself on a large dataset. I did some small training on the model, but it's not enough for an LLM
+it's a base model too, so you should fine-tune it yourself on a large dataset. I did some small training on the model, but it's not enough for an LLM
+either way, it was trained on 10K rows of the FineWeb dataset, which is considered insufficient. I did end up with an average loss of 2.3468, so yeah, you can still fine-tune the model, but by the time I get stronger GPUs I'll just target 7B or 14B parameters, etc.
+BUUUT this is already enough, and I'm planning to make more kinds of AI models with custom architectures in the future
+and I might make roleplaying AI models, so stay tuned for that :3
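
To put the reported average loss in perspective, here is a minimal sketch that converts it to perplexity. It assumes the 2.3468 figure is a mean per-token cross-entropy measured in nats (natural log), which is how common training loops usually report it; the card does not say this explicitly, nor whether it is a training or evaluation loss. The helper name `perplexity_from_loss` is made up for illustration.

```python
import math


def perplexity_from_loss(avg_loss: float) -> float:
    """Convert a mean per-token cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(avg_loss)


# Average loss quoted in the model card; the unit (nats) is an assumption.
reported_loss = 2.3468
ppl = perplexity_from_loss(reported_loss)
print(f"perplexity ~ {ppl:.2f}")
```

Under that assumption the perplexity comes out at roughly 10.45, meaning the model is about as uncertain as if it were picking uniformly among ten or so tokens at each step. If the loss was computed with a different log base or averaged differently, the number would not carry over.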