HDTenEightyP committed
Commit d50201e · verified · 1 Parent(s): a18b346

Update README.md
---
license: mit
language:
- en
tags:
- text-generation-inference
pipeline_tag: text-generation
---

![GPTUsenet2](https://cdn-uploads.huggingface.co/production/uploads/64b7618e2f5a966b972e9978/FNEKaeJ3of0W_HQ8x3amo.jpeg)

## GPT-Usenet-2
An 81-million-parameter LLM using GPT-2 encodings.
Trained on 10 GB of USENET posts along with over 1 GB of miscellaneous BBS posts, digitized books, and text documents.
Supervised fine-tuning should be performed before use.

## Purpose of GPT-Usenet-2
Current LLMs keep growing larger so they can do more and more, but that only makes them jacks of all trades and masters of none. GPT-Usenet takes a different approach: rather than trying to do everything well, it serves as a digital stem cell that can be fine-tuned into a single, specialized role and run in parallel with copies of itself.

## Technical Information
| Parameter | Value |
|---------------------------------|----:|
| Layers | 10 |
| Heads | 10 |
| Embedding Dimension | 640 |
| Context Window | 1024 tokens |
| Tokenizer | GPT-2 BPE |
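As a sanity check, the figures in the table are consistent with the advertised 81M parameters under a standard GPT-2-style layout. The estimate below assumes tied input/output embeddings and the stock 50,257-token GPT-2 vocabulary, neither of which is stated on the card:

```python
# Rough parameter-count estimate for GPT-Usenet-2 from the table above.
# Assumes a standard GPT-2-style architecture with tied embeddings and
# the stock GPT-2 BPE vocabulary (50,257 tokens); both are assumptions.

def gpt2_param_estimate(n_layer: int, d_model: int, vocab: int, n_ctx: int) -> int:
    # Each transformer block: ~12 * d_model^2 parameters
    # (4 d^2 for attention + 8 d^2 for the MLP), ignoring bias/LayerNorm terms.
    blocks = 12 * n_layer * d_model * d_model
    token_emb = vocab * d_model      # shared with the output head when tied
    pos_emb = n_ctx * d_model
    return blocks + token_emb + pos_emb

total = gpt2_param_estimate(n_layer=10, d_model=640, vocab=50257, n_ctx=1024)
print(f"{total / 1e6:.1f}M parameters")  # lands near the advertised 81M
```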

## Training Information
| Metric | Value |
|---------------------------------|----:|
| Training Loss | 2.1863 |
| Validation Loss | 2.0754 |
| Device | Google Colab L4, Google Colab A100 |
| Training Time | 16 hours |

## Example Syntax

| Field | Meaning |
|---------------------------------|----:|
| uucp: | The path of reasoning you want GPT-Usenet to use when thinking. Use lowercase words separated by exclamation points. |
| Internet: | The system calls relevant to this email. |
| Path: | The path of reasoning you want GPT-Usenet to use when writing. Use lowercase words separated by exclamation points. |
| From: | The username who sent this message. |
| Sender: | The group that username belongs to. |
| Newsgroups: | The broad subject field of the email. |
| Subject: | The prompt. |
| Message-ID: | The type of message this is. |
| Date: | Use this field to simulate urgency or moods. |
| Organization: | The system GPT-Usenet is running on (testing, deployment, simulation). |
| Lines: | How long the message is. |
| Write the SFT response here. First, prefix the first sentence with > to signify that it is a Reasoning sentence. | |
| -- | The stop tokens. |

```
uucp:!field1!field2!
Internet:simulation
Path:!field1!field2!
From:user
Sender:usergroup
Newsgroups:motorskills.papercraft
Subject:Build a paper airplane
Message-ID:Command
Date:01 Jan 01 00:00:01 GMT
Organization:deployment
Lines: 1

>Provide detailed steps on building a paper airplane.

--
```
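The header layout above is mechanical enough to assemble programmatically. A minimal sketch follows; `build_prompt` and `cut_at_stop` are hypothetical helper names, and the field defaults are illustrative — only the header layout itself comes from this card:

```python
# Sketch of a helper that assembles a GPT-Usenet-2 prompt from the header
# fields documented above. build_prompt and cut_at_stop are hypothetical
# names, not part of the model's tooling.

def build_prompt(subject: str, reasoning_path: list[str],
                 user: str = "user", group: str = "usergroup",
                 newsgroup: str = "misc.misc",
                 organization: str = "deployment") -> str:
    # uucp:/Path: want lowercase words separated by exclamation points.
    path = "!" + "!".join(w.lower() for w in reasoning_path) + "!"
    headers = [
        f"uucp:{path}",
        "Internet:simulation",
        f"Path:{path}",
        f"From:{user}",
        f"Sender:{group}",
        f"Newsgroups:{newsgroup}",
        f"Subject:{subject}",
        "Message-ID:Command",
        "Date:01 Jan 01 00:00:01 GMT",
        f"Organization:{organization}",
        "Lines: 1",
    ]
    return "\n".join(headers) + "\n\n"

def cut_at_stop(completion: str) -> str:
    # The card lists "--" as the stop tokens; keep only the text before it.
    return completion.split("\n--", 1)[0].strip()

prompt = build_prompt("Build a paper airplane", ["field1", "field2"],
                      newsgroup="motorskills.papercraft")
print(prompt)
```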

For fine-tuning, your data should be in the .mbox format.
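Since .mbox is the expected input, Python's standard-library `mailbox` module can flatten a mailbox into training strings. A sketch, assuming one message per sample; the `mbox_to_samples` name and the exact header-to-text mapping are assumptions, not the card's documented pipeline:

```python
import mailbox

def mbox_to_samples(path: str, stop: str = "\n--\n") -> list[str]:
    """Read an .mbox file and flatten each message into one
    header-block + body training string, ending with the stop marker."""
    samples = []
    for msg in mailbox.mbox(path):
        # Reproduce the colon-joined header layout from the syntax table.
        headers = "\n".join(f"{k}:{v}" for k, v in msg.items())
        body = msg.get_payload()
        if isinstance(body, list):          # skip multipart messages
            continue
        samples.append(f"{headers}\n\n{body.strip()}{stop}")
    return samples
```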