---
library_name: transformers
license: apache-2.0
---
# 🧠 JAT-GPT: Just Another Tiny GPT
Welcome to **JAT-GPT**, the world's most underwhelming large language model, clocking in at a mighty **17.9 million parameters** (yes, million, not billion; stop laughing).
## 📦 Model Details
- **Model type**: GPT2-based decoder-only transformer
- **Architecture**: GPT-2
- **Library**: Hugging Face 🤗 Transformers
- **Parameters**: 17.9 million (size isn't everything... right?) — you can verify the count with the sketch after this list
- **Training Objective**: Learn to predict the next word, and sometimes even the *right* one!
- **Pretrained on**: A secret* dataset (*"secret" means the dataset was just some text I could find lying around)
- **Training Purpose**: Solely educational. Also for flexing on friends who haven't trained a language model from scratch.
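
Curious whether 17.9 million is real or just bragging in reverse? Here is a minimal sketch for inspecting the architecture and counting the parameters yourself; it assumes the standard 🤗 Transformers `AutoConfig` / `AutoModelForCausalLM` loaders work with this checkpoint, and the exact hyperparameters it prints depend on the config stored in the repo.

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Load the GPT-2-style config and weights from the Hub
config = AutoConfig.from_pretrained("itsme-nishanth/JAT-GPT")
model = AutoModelForCausalLM.from_pretrained("itsme-nishanth/JAT-GPT")

# Inspect the architecture (layers, heads, hidden size, context length)
print(config)

# Count the parameters; this should land around the advertised ~17.9M
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")
```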
## 🚀 Capabilities
- Can generate short sentences.
- "Please lower your expectations."
- Can hallucinate confidently, but in a very short and polite way.
- Tends to drift into random words after a few tokens.
## 🙅 Limitations
- Not very smart.
- Pretrained only (no instruction tuning or fine-tuning).
- Understands context... if it fits within a few tokens.
- Cannot replace ChatGPT. (But look how cute it is!)
## 🤷 Why Train This?
> "Because I could." – :-)
- To understand the internals of language modeling.
- To cry less when training real models later.
- To appreciate just how powerful modern LLMs are by comparison.
## πŸ› οΈ Usage
```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load the tokenizer and model weights from the Hub
tokenizer = GPT2Tokenizer.from_pretrained("itsme-nishanth/JAT-GPT")
model = GPT2LMHeadModel.from_pretrained("itsme-nishanth/JAT-GPT")

# Encode a prompt and sample a short continuation
input_ids = tokenizer.encode("Once upon a time", return_tensors="pt")
output = model.generate(input_ids, max_length=20, do_sample=True, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
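
For a bit more (or less) chaos, the usual `generate` sampling knobs apply. The snippet below is a sketch that reuses `model`, `tokenizer`, and `input_ids` from above; the specific values are illustrative, not tuned for this checkpoint.

```python
# Illustrative sampling settings; tweak freely, the model will be silly either way
output = model.generate(
    input_ids,
    max_new_tokens=30,
    do_sample=True,
    temperature=0.8,   # lower = more conservative
    top_k=50,          # restrict sampling to the 50 most likely tokens
    top_p=0.95,        # nucleus sampling cutoff
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```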