HDTenEightyP committed ba1eedd (verified) · 1 Parent(s): d774522

Update README.md

Files changed (1): README.md (+4 −14)
README.md CHANGED
@@ -9,12 +9,12 @@ pipeline_tag: text-generation
 
 ![GPTUsenet2](https://cdn-uploads.huggingface.co/production/uploads/64b7618e2f5a966b972e9978/FNEKaeJ3of0W_HQ8x3amo.jpeg)
 
-## GPT-Usenet
+## GPT-Usenet-3
 An 81-million parameter LLM using GPT-2 encodings.
-Trained using 10GB of USENET posts along with over 1 GB of miscellaneous BBS posts, digitized books, and text documents.
+Trained using 10GB of USENET posts, over 1 GB of miscellaneous BBS posts, digitized books, and text documents, and 1.1 GB of multilingual text.
 Supervised fine-tuning should be performed before use.
 
-## Purpose of GPT-Usenet
+## Purpose of GPT-Usenet-3
 LLMs are all currently focused on becoming larger and larger, able to do more and more. However, this just makes them jack of all trades, master of none. GPT-Usenet takes a different approach. Instead of trying to do everything perfectly, GPT-Usenet offers a digital stem cell, which can then be finetuned into a single, specialized role and run in parallel with copies of itself.
 
 ## Technical Information
@@ -23,19 +23,9 @@ LLMs are all currently focused on becoming larger and larger, able to do more an
 |Layers |10|
 |Heads |10|
 |Embeddings |640|
-|Context Window |1024 tokens|
+|Context Window |8192 tokens|
 |Tokenizer |GPT-2 BPE|
 
-
-## Training Information
-| | |
-|---------------------------------|----:|
-|Training Loss |2.3256|
-|Validation Loss |2.3651|
-|Device |Google Colab L4|
-|Training Time |16 Hours|
-
-
 ## Example Syntax
 
 | | |
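The hyperparameters in the Technical Information table pin down the model size, and a quick back-of-the-envelope count recovers the stated 81M figure. This is only a sketch: the variable names are illustrative, the GPT-2 BPE vocabulary size (50,257) is assumed from the tokenizer row, and position embeddings, biases, and layer norms are ignored since the README does not state the positional scheme.

```python
# Hypothetical names; values copied from the Technical Information table.
n_layer, n_head, n_embd, vocab_size = 10, 10, 640, 50257  # 50257 = GPT-2 BPE vocab

assert n_embd % n_head == 0  # each attention head is 640 / 10 = 64-dimensional

# GPT-2-style estimate: token embeddings plus roughly 12 * n_embd^2
# weights per transformer block (QKV + output projection + 4x MLP),
# ignoring position embeddings, biases, and layer norms.
token_embeddings = vocab_size * n_embd          # 32,164,480
per_block = 12 * n_embd ** 2                    #  4,915,200
total_m = (token_embeddings + n_layer * per_block) / 1e6
print(f"~{total_m:.0f}M parameters")            # ~81M, matching the model card
```

Note that token embeddings are counted once; in GPT-2-style models the output head is typically weight-tied to the input embedding matrix, so no separate unembedding parameters are added.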