---
library_name: transformers
license: apache-2.0
---
# 🧠 JAT-GPT: Just Another Tiny GPT

Welcome to **JAT-GPT**, the world's most underwhelming large language model — clocking in at a mighty **17.9 million parameters** (yes, million, not billion — stop laughing). 

## 📦 Model Details

- **Model type**: GPT-2-based decoder-only transformer
- **Architecture**: GPT-2
- **Library**: Hugging Face 🤗 Transformers
- **Parameters**: 17.9 million (size isn't everything... right?), double-checked in the sketch after this list
- **Training Objective**: Learn to predict the next word — and sometimes even the *right* one!
- **Pretrained on**: A secret* dataset (*"secret" means the dataset was just some text I could find lying around)
- **Training Purpose**: Solely educational. Also for flexing on friends who haven’t trained a language model from scratch.
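
Curious whether 17.9 million is real? Here's a minimal sanity check (just a sketch; the repo id is the same one used in the Usage section below):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained("itsme-nishanth/JAT-GPT")
tokenizer = GPT2Tokenizer.from_pretrained("itsme-nishanth/JAT-GPT")

# Count every trainable knob in the model
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")

# The training objective, live: next-token prediction (causal LM cross-entropy).
# Passing labels=input_ids makes the model shift labels internally, so the
# token at position t is scored on predicting the token at position t+1.
ids = tokenizer("Once upon a time", return_tensors="pt").input_ids
print(f"next-token loss: {model(ids, labels=ids).loss.item():.2f}")
```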

## 🚀 Capabilities

- Can generate short sentences
  - "Please lower your expectations."
- Can hallucinate confidently, but in a very short and polite way.
- Can generate random words after a few tokens.

## 🙅 Limitations

- Not very smart.
- Pretrained only; no instruction tuning, so it won't follow your orders.
- Understands context... if it fits within a few tokens.
- Cannot replace ChatGPT. (But look how cute it is!)

## 🤷 Why Train This?

> "Because I could." – :-)

- To understand the internals of language modeling.
- To cry less when training real models later.
- To appreciate just how powerful modern LLMs are by comparison.

## 🛠️ Usage

```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load the tokenizer and model weights from the Hub
tokenizer = GPT2Tokenizer.from_pretrained("itsme-nishanth/JAT-GPT")
model = GPT2LMHeadModel.from_pretrained("itsme-nishanth/JAT-GPT")

# Encode a prompt and sample a short continuation
input_ids = tokenizer.encode("Once upon a time", return_tensors="pt")
output = model.generate(
    input_ids,
    max_length=20,  # total length, prompt included
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; this silences the warning
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
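
Prefer fewer moving parts? The same generation via the `pipeline` helper (the sampling settings here are illustrative, not blessed defaults):

```python
from transformers import pipeline

# Wraps tokenizer + model + generate in a single call
generator = pipeline("text-generation", model="itsme-nishanth/JAT-GPT")
print(generator("Once upon a time", max_new_tokens=20, do_sample=True)[0]["generated_text"])
```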