---
library_name: transformers
license: apache-2.0
---
# 🧠 JAT-GPT: Just Another Tiny GPT

Welcome to **JAT-GPT**, the world's most underwhelming large language model, clocking in at a mighty **17.9 million parameters** (yes, million, not billion; stop laughing).

## 📦 Model Details

- **Model type**: Decoder-only transformer
- **Architecture**: GPT-2
- **Library**: Hugging Face 🤗 Transformers
- **Parameters**: 17.9 million (size isn't everything... right? You can count them yourself with the sketch below.)
- **Training Objective**: Learn to predict the next word, and sometimes even the *right* one!
- **Pretrained on**: A secret* dataset (*"secret" means the dataset was just some text I could find lying around)
- **Training Purpose**: Solely educational. Also for flexing on friends who haven't trained a language model from scratch.

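Don't take the 17.9 million on faith. Here's a quick sketch to verify the parameter count and poke at the next-word objective, assuming the checkpoint loads as a standard GPT-2 model:

```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("itsme-nishanth/JAT-GPT")
model = GPT2LMHeadModel.from_pretrained("itsme-nishanth/JAT-GPT")

# Count trainable parameters (tied embeddings are counted once).
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")

# The training objective is plain next-token prediction: passing the
# input_ids as labels makes the model return its cross-entropy loss.
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model(**inputs, labels=inputs["input_ids"])
print(f"next-token loss: {outputs.loss.item():.2f}")
```
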
## 🚀 Capabilities

- Can generate short sentences.
  - "Please lower your expectations."
- Can hallucinate confidently, but in a very short and polite way.
- Tends to wander into random words after a few tokens.

## 😅 Limitations

- Not very smart.
- Pretrained only; no fine-tuning of any kind.
- Understands context... if it fits within a few tokens.
- Cannot replace ChatGPT. (But look how cute it is!)

## 🤷 Why Train This?

> "Because I could." – :-)

- To understand the internals of language modeling.
- To cry less when training real models later.
- To appreciate just how powerful modern LLMs are by comparison.

## 🛠️ Usage

```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("itsme-nishanth/JAT-GPT")
model = GPT2LMHeadModel.from_pretrained("itsme-nishanth/JAT-GPT")

# Encode a prompt and sample a short continuation.
input_ids = tokenizer.encode("Once upon a time", return_tensors="pt")
output = model.generate(
    input_ids,
    max_length=20,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; this avoids a warning
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

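Free sampling at default settings gives a model this small plenty of rope. Tightening the sampling can help; the temperature and top-k values below are illustrative starting points, not values tuned for this checkpoint:

```python
output = model.generate(
    input_ids,
    max_length=30,
    do_sample=True,
    temperature=0.8,  # below 1.0 sharpens the distribution, reducing randomness
    top_k=50,         # sample only from the 50 most likely next tokens
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```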