🧠 JAT-GPT: Just Another Tiny GPT

Welcome to JAT-GPT, the world's most underwhelming large language model, clocking in at a mighty 17.9 million parameters (yes, million, not billion; stop laughing).

📦 Model Details

  • Model type: GPT-2-based decoder-only transformer (a rough configuration sketch follows this list)
  • Architecture: GPT-2
  • Library: Hugging Face 🤗 Transformers
  • Parameters: 17.9 million (size isn't everything... right?)
  • Training Objective: Learn to predict the next word, and sometimes even the right one!
  • Pretrained on: A secret* dataset (*"secret" means the dataset was just some text I could find lying around)
  • Training Purpose: Solely educational. Also for flexing on friends who haven't trained a language model from scratch.
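
For the curious, the snippet below is a rough, illustrative sketch of what a GPT-2 this small might look like. The hyperparameters (n_embd=256, n_layer=6, n_head=8, standard GPT-2 vocabulary) are guesses that happen to land near 17.9M parameters, not the actual JAT-GPT configuration. It also shows the training objective in one line: pass labels=input_ids and the model returns the next-token cross-entropy loss.

from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

# Guessed hyperparameters that land near 17.9M parameters;
# not necessarily the real JAT-GPT configuration.
config = GPT2Config(vocab_size=50257, n_positions=1024, n_embd=256, n_layer=6, n_head=8)
model = GPT2LMHeadModel(config)
print(f"{model.num_parameters() / 1e6:.1f}M params")

# Pretraining objective: next-token prediction. Passing labels=input_ids makes
# the model shift the targets by one and return the causal LM cross-entropy loss.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # standard GPT-2 BPE, for illustration only
batch = tokenizer("Please lower your expectations.", return_tensors="pt")
loss = model(input_ids=batch["input_ids"], labels=batch["input_ids"]).loss
print(loss.item())  # at random initialization this is about ln(50257) ≈ 10.8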

🚀 Capabilities

  • Can generate short sentences
    • "Please lower your expectations."
  • Can hallucinate confidently, but in a very short and polite way.
  • Can generate random words after a few tokens.

🙅 Limitations

  • Not very smart.
  • Pretrained only; no fine-tuning or instruction tuning.
  • Understands context... if it fits within a few tokens.
  • Cannot replace ChatGPT. (But look how cute it is!)

🤷 Why Train This?

"Because I could." – :-)

  • To understand the internals of language modeling.
  • To cry less when training real models later.
  • To appreciate just how powerful modern LLMs are by comparison.

πŸ› οΈ Usage

from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load the tokenizer and trained weights from the Hugging Face Hub
tokenizer = GPT2Tokenizer.from_pretrained("itsme-nishanth/JAT-GPT")
model = GPT2LMHeadModel.from_pretrained("itsme-nishanth/JAT-GPT")

# Encode a prompt and sample a short continuation
input_ids = tokenizer.encode("Once upon a time", return_tensors="pt")
output = model.generate(
    input_ids,
    max_length=20,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,  # silences the missing-pad-token warning
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
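
Alternatively, here is a minimal sketch using the 🤗 Transformers pipeline helper (same model, same caveats about expectations):

from transformers import pipeline

# Wrap the model in a text-generation pipeline; weights are fetched from the Hub on first use
generator = pipeline("text-generation", model="itsme-nishanth/JAT-GPT")

# Keep max_new_tokens small; this is, after all, a tiny model
print(generator("Once upon a time", max_new_tokens=20, do_sample=True)[0]["generated_text"])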