---
library_name: transformers
license: apache-2.0
---
# 🧠 JAT-GPT: Just Another Tiny GPT
Welcome to **JAT-GPT**, the world's most underwhelming large language model — clocking in at a mighty **17.9 million parameters** (yes, million, not billion — stop laughing).
## 📦 Model Details
- **Model type**: Decoder-only transformer language model
- **Architecture**: GPT-2
- **Library**: Hugging Face 🤗 Transformers
- **Parameters**: 17.9 million (size isn't everything... right?)
- **Training Objective**: Learn to predict the next word — and sometimes even the *right* one!
- **Pretrained on**: A secret* dataset (*"secret" means the dataset was just some text I could find lying around)
- **Training Purpose**: Solely educational. Also for flexing on friends who haven’t trained a language model from scratch.
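The 17.9M figure can be sanity-checked from the architecture alone. Below is a small, self-contained sketch of the standard GPT-2 parameter-count formula (tied `lm_head`); the dimensions used in the check are GPT-2 small's published config, not JAT-GPT's actual config, which this card doesn't state:

```python
def gpt2_param_count(vocab_size: int, n_positions: int, d_model: int, n_layers: int) -> int:
    """Approximate parameter count for a GPT-2-style decoder with a tied lm_head."""
    embeddings = vocab_size * d_model + n_positions * d_model  # wte + wpe
    # Per block: attention (4*d^2 + 4*d) + MLP (8*d^2 + 5*d) + two LayerNorms (4*d)
    per_layer = 12 * d_model**2 + 13 * d_model
    final_ln = 2 * d_model  # final LayerNorm after the last block
    return embeddings + n_layers * per_layer + final_ln

# Reference check against GPT-2 small's known config (~124M parameters)
print(gpt2_param_count(vocab_size=50257, n_positions=1024, d_model=768, n_layers=12))
# → 124439808
```

Plugging in smaller dimensions (fewer layers, a narrower `d_model`, a trimmed vocabulary) is how you land in 17.9M territory.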
## 🚀 Capabilities
- Can generate short sentences.
- "Please lower your expectations."
- Can hallucinate confidently, but in a very short and polite way.
- Tends to drift into random words after a few tokens.
## 🙅 Limitations
- Not very smart.
- Pretrained only — no fine-tuning or instruction tuning.
- Understands context... if it fits within a few tokens.
- Cannot replace ChatGPT. (But look how cute it is!)
## 🤷 Why Train This?
> "Because I could." – :-)
- To understand the internals of language modeling.
- To cry less when training real models later.
- To appreciate just how powerful modern LLMs are by comparison.
## 🛠️ Usage
```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load the tokenizer and model from the Hub
tokenizer = GPT2Tokenizer.from_pretrained("itsme-nishanth/JAT-GPT")
model = GPT2LMHeadModel.from_pretrained("itsme-nishanth/JAT-GPT")

# Encode a prompt and sample a short continuation
input_ids = tokenizer.encode("Once upon a time", return_tensors="pt")
output = model.generate(
    input_ids,
    max_length=20,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,  # avoids the missing-pad-token warning
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```