---
license: mit
language:
- en
tags:
- text-generation-inference
pipeline_tag: text-generation
---

## GPT-Usenet

An 81-million-parameter LLM that uses the GPT-2 tokenizer.

Trained on 10 GB of USENET posts along with over 1 GB of miscellaneous BBS posts, digitized books, and text documents.

Supervised fine-tuning should be performed before use.

## Purpose of GPT-Usenet

Current LLMs are focused on becoming ever larger and more capable, which makes them jacks of all trades, masters of none. GPT-Usenet takes a different approach: rather than trying to do everything, it is a digital stem cell that can be fine-tuned into a single, specialized role and run in parallel with copies of itself.

## Technical Information

|Hyperparameter|Value|
|---------------------------------|----:|
|Layers|10|
|Heads|10|
|Embedding dimension|640|
|Context window|1024 tokens|
|Tokenizer|GPT-2 BPE|
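
Under a standard GPT-2-style architecture, these hyperparameters do land near the stated 81M parameters. A quick arithmetic sketch, assuming the stock GPT-2 vocabulary of 50,257 tokens and tied input/output embeddings (biases and LayerNorm terms ignored):

```python
# Rough GPT-2-style parameter count for GPT-Usenet's hyperparameters.
vocab, ctx, d, layers = 50_257, 1_024, 640, 10

token_emb = vocab * d      # token embedding matrix (tied with output head)
pos_emb = ctx * d          # learned positional embeddings
attn = 4 * d * d           # Q, K, V, and output projections per layer
mlp = 2 * (d * 4 * d)      # two feed-forward projections (4x expansion)

total = token_emb + pos_emb + layers * (attn + mlp)
print(f"{total / 1e6:.1f}M parameters")  # roughly 82M
```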

## Training Information

|Detail|Value|
|---------------------------------|----:|
|Training loss|2.3256|
|Validation loss|2.3651|
|Device|Google Colab L4 GPU|
|Training time|16 hours|

## Example Syntax

|Field|Meaning|
|---------------------------------|----:|
|uucp:|The path of reasoning you want GPT-Usenet to follow when thinking. Use lowercase words separated by exclamation points.|
|Internet:|The system calls relevant to this email.|
|Path:|The path of reasoning you want GPT-Usenet to follow when writing. Use lowercase words separated by exclamation points.|
|From:|The username that sent this message.|
|Sender:|The group that username belongs to.|
|Newsgroups:|The broad subject field of the email.|
|Subject:|The prompt.|
|Message-ID:|The type of message this is.|
|Date:|Use this field to simulate urgency or moods.|
|Organization:|The system GPT-Usenet is running on (testing, deployment, simulation).|
|Lines:|How long the message is.|
|(message body)|The SFT response. Prefix its first sentence with `>` to mark it as a reasoning sentence.|
|--|The stop tokens.|

```
uucp:!field1!field2!
Internet:simulation
Path:!field1!field2!
From:user
Sender:usergroup
Newsgroups:motorskills.papercraft
Subject:Build a paper airplane
Message-ID:Command
Date:01 Jan 01 00:00:01 GMT
Organization:deployment
Lines: 1

>Provide detailed steps on building a paper airplane.

--
```
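
The header format above is regular enough to assemble programmatically. A minimal sketch in Python; the helper name and the default field values are ours, not part of the model:

```python
def build_prompt(subject, reasoning_path, from_user="user",
                 sender="usergroup", newsgroups="misc.misc",
                 organization="deployment"):
    """Assemble a GPT-Usenet prompt from the header fields above.

    reasoning_path is a list of lowercase words, joined with
    exclamation points for the uucp: and Path: fields.
    """
    path = "!" + "!".join(reasoning_path) + "!"
    headers = [
        f"uucp:{path}",
        "Internet:simulation",
        f"Path:{path}",
        f"From:{from_user}",
        f"Sender:{sender}",
        f"Newsgroups:{newsgroups}",
        f"Subject:{subject}",
        "Message-ID:Command",
        "Date:01 Jan 01 00:00:01 GMT",
        f"Organization:{organization}",
        "Lines: 1",
    ]
    # Blank line after the headers, where the model's response begins.
    return "\n".join(headers) + "\n\n"

prompt = build_prompt("Build a paper airplane", ["field1", "field2"],
                      newsgroups="motorskills.papercraft")
print(prompt)
```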

For fine-tuning, your data should be in the `.mbox` format.
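
Python's standard-library `mailbox` module can write `.mbox` files directly. A sketch of packaging one SFT example, reusing the header scheme above (field values are illustrative):

```python
import mailbox

# Append one training example to an mbox file using the header scheme above.
box = mailbox.mbox("train.mbox")
msg = mailbox.mboxMessage()
msg["Path"] = "!field1!field2!"
msg["From"] = "user"
msg["Newsgroups"] = "motorskills.papercraft"
msg["Subject"] = "Build a paper airplane"
msg.set_payload(">Provide detailed steps on building a paper airplane.\n\n--")
box.add(msg)
box.flush()
box.close()
```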