Update README.md
README.md
CHANGED
@@ -5,4 +5,10 @@ library_name: transformers
 
 this is the generative pretrained transformer (GPT) version of the model :D
 
-it's a base model too, so you should finetune it manually on a large dataset. I did some small training on the model, but it's not enough for an LLM.
+it's a base model too, so you should finetune it manually on a large dataset. I did some small training on the model, but it's not enough for an LLM.
+
+either way, it was trained on 10K rows of the fineweb dataset, which is considered insufficient. I did end up with an average loss of 2.3468, so yeah, you can still finetune the model, but by the time I get stronger GPUs I'll just target 7B or 14B parameters, etc...
+
+BUUUT this is already enough, and I'm planning to make more kinds of AI models in the future with custom architectures
+
+and I might make roleplaying AI models, so stay tuned for that :3
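
For a rough sense of what the reported average loss of 2.3468 means, it can be converted to a perplexity. This is only a sketch under an assumption: the README does not say how the loss was computed, so the code treats it as a mean per-token cross-entropy in nats (the usual default for causal language model training). The function name `loss_to_perplexity` is made up for illustration.

```python
import math


def loss_to_perplexity(mean_loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity.

    Assumption: the loss uses the natural log, which the README
    does not confirm.
    """
    return math.exp(mean_loss)


# Average loss reported in the README.
reported_loss = 2.3468
ppl = loss_to_perplexity(reported_loss)
print(f"perplexity ~ {ppl:.2f}")
```

If that assumption holds, the result is a perplexity of about 10.45 on whatever data the loss was measured on. The number is only comparable to other models that use the same tokenizer and the same evaluation text.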