moelanoby committed on
Commit 59e3ed2 · verified · 1 Parent(s): 5168ee0

Update README.md

Files changed (1)
  1. README.md +7 -1
README.md CHANGED
@@ -5,4 +5,10 @@ library_name: transformers
5
 
6
  this is the generative pretrained model (GPT) version of the model :D
7
 
8
- it's a base model too, so you should finetune it yourself on a large dataset. I did some small training on the model, but it's not enough for an LLM
 
5
 
6
  this is the generative pretrained model (GPT) version of the model :D
7
 
8
+ it's a base model too, so you should finetune it yourself on a large dataset. I did some small training on the model, but it's not enough for an LLM
9
+
10
+ either way, it was trained on 10K rows of the FineWeb dataset, which is considered insufficient. I did end up with an average loss of 2.3468, so yeah, you can still finetune the model, but by the time I get stronger GPUs I'll just target 7B or 14B parameters, etc.
11
+
12
+ BUUUT this is already enough, and I'm planning to make more kinds of AI models with custom architectures in the future
13
+
14
+ and I might make roleplaying AI models, so stay tuned for that :3
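
For a rough sense of what the reported average loss of 2.3468 means, it can be converted to perplexity. This is only a minimal sketch: it assumes the loss is the mean per-token cross-entropy in nats (natural log), which is what PyTorch-style training loops usually report, but the commit does not say so explicitly. The function name `perplexity_from_loss` is made up for illustration.

```python
import math


def perplexity_from_loss(mean_loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_loss_nats)


# 2.3468 is the average loss quoted in the README.
# Treating it as nats per token is an assumption, not something the README states.
reported_loss = 2.3468
ppl = perplexity_from_loss(reported_loss)
print(f"perplexity ~ {ppl:.2f}")  # roughly 10.45
```

Under that assumption, the model is on average about as uncertain as if it were picking among roughly 10 equally likely tokens on the data the loss was measured on, which fits the README's point that more training or finetuning is needed.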