## GPT-Usenet-3

An 81-million-parameter LLM using GPT-2 encodings.

Trained on 10 GB of USENET posts, over 1 GB of miscellaneous BBS posts, digitized books, and other text documents, and 1.1 GB of multilingual text.

Supervised fine-tuning should be performed before use.
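
Because the model reuses GPT-2's byte-pair encoding, its tokenization can be illustrated with a minimal BPE merge loop. This is a toy sketch with a made-up three-entry merge table, not GPT-2's actual ~50,000-merge vocabulary:

```python
def bpe_merge(tokens, merges):
    """Repeatedly apply the highest-priority (lowest-rank) merge, the way
    GPT-2's BPE collapses frequent symbol pairs into single tokens."""
    while True:
        pairs = [(tokens[i], tokens[i + 1]) for i in range(len(tokens) - 1)]
        ranked = [(merges[p], p) for p in pairs if p in merges]
        if not ranked:
            return tokens
        _, best = min(ranked)  # best-ranked pair to merge next
        out, i = [], 0
        while i < len(tokens):
            if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == best:
                out.append(tokens[i] + tokens[i + 1])
                i += 2
            else:
                out.append(tokens[i])
                i += 1
        tokens = out

# Hypothetical merge table: rank 0 is applied first.
toy_merges = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2}
print(bpe_merge(list("lower"), toy_merges))  # ['low', 'er']
```

GPT-2's real tokenizer also maps raw bytes to printable symbols before merging, which this sketch omits.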

## Purpose of GPT-Usenet-3

Current LLMs focus on becoming ever larger and doing ever more, which makes them jacks of all trades, masters of none. GPT-Usenet-3 takes a different approach: instead of trying to do everything, it offers a digital stem cell that can be fine-tuned into a single, specialized role and run in parallel with copies of itself.

## Technical Information

| | |
|---------------|--------:|
|Layers |10|
|Heads |10|
|Embedding Dimension |640|
|Context Window |8192 tokens|
|Tokenizer |GPT-2 BPE|
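
As a back-of-the-envelope check, the hyperparameters above can be plugged into a standard GPT-2-style parameter count. The sketch below assumes the GPT-2 BPE vocabulary size (50,257), learned position embeddings over the full 8192-token window, and tied input/output embeddings; none of these details are confirmed by the card, which is presumably why the estimate lands near, but not exactly on, the stated 81 million:

```python
vocab, n_ctx, n_embd, n_layer = 50257, 8192, 640, 10

token_emb = vocab * n_embd   # token embeddings (tied with the LM head)
pos_emb = n_ctx * n_embd     # learned position embeddings
per_block = (
    n_embd * 3 * n_embd + 3 * n_embd    # fused QKV projection + bias
    + n_embd * n_embd + n_embd          # attention output projection + bias
    + n_embd * 4 * n_embd + 4 * n_embd  # MLP up-projection + bias
    + 4 * n_embd * n_embd + n_embd      # MLP down-projection + bias
    + 4 * n_embd                        # two layer norms (scale + bias each)
)
final_ln = 2 * n_embd
total = token_emb + pos_emb + n_layer * per_block + final_ln
print(f"{total / 1e6:.1f}M parameters")  # 86.6M under these assumptions
```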
## Example Syntax
| | |