---
library_name: transformers
license: apache-2.0
---
# 🧠 JAT-GPT: Just Another Tiny GPT
Welcome to **JAT-GPT**, the world's most underwhelming large language model, clocking in at a mighty **17.9 million parameters** (yes, million, not billion; stop laughing).
## 📦 Model Details
- **Model type**: GPT2-based decoder-only transformer
- **Architecture**: GPT-2
- **Library**: Hugging Face 🤗 Transformers
- **Parameters**: 17.9 million (size isn't everything... right?) — you can verify the count with the sketch after this list
- **Training Objective**: Learn to predict the next word, and sometimes even the *right* one!
- **Pretrained on**: A secret* dataset (*"secret" means the dataset was just some text I could find lying around)
- **Training Purpose**: Solely educational. Also for flexing on friends who haven't trained a language model from scratch.
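
Curious whether 17.9 million is real or just bragging in reverse? Here is a minimal sketch for inspecting the architecture and counting the parameters yourself; it assumes the standard 🤗 Transformers `AutoConfig` / `AutoModelForCausalLM` loaders work with this checkpoint, and the exact hyperparameters it prints depend on the config stored in the repo.

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Load the GPT-2-style config and weights from the Hub
config = AutoConfig.from_pretrained("itsme-nishanth/JAT-GPT")
model = AutoModelForCausalLM.from_pretrained("itsme-nishanth/JAT-GPT")

# Inspect the architecture (layers, heads, hidden size, context length)
print(config)

# Count the parameters; this should land around the advertised ~17.9M
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")
```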
## 🚀 Capabilities
- Can generate short sentences.
- "Please lower your expectations."
- Can hallucinate confidently, but in a very short and polite way.
- Tends to drift into random words after a few tokens.
## 🙅 Limitations
- Not very smart.
- Pretrained only (no instruction tuning or fine-tuning).
- Understands context... if it fits within a few tokens.
- Cannot replace ChatGPT. (But look how cute it is!)
## 🤷 Why Train This?
> "Because I could." – :-)
- To understand the internals of language modeling.
- To cry less when training real models later.
- To appreciate just how powerful modern LLMs are by comparison.
## πŸ› οΈ Usage
```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load the tokenizer and model weights from the Hub
tokenizer = GPT2Tokenizer.from_pretrained("itsme-nishanth/JAT-GPT")
model = GPT2LMHeadModel.from_pretrained("itsme-nishanth/JAT-GPT")

# Encode a prompt and sample a short continuation
input_ids = tokenizer.encode("Once upon a time", return_tensors="pt")
output = model.generate(input_ids, max_length=20, do_sample=True, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
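
For a bit more (or less) chaos, the usual `generate` sampling knobs apply. The snippet below is a sketch that reuses `model`, `tokenizer`, and `input_ids` from above; the specific values are illustrative, not tuned for this checkpoint.

```python
# Illustrative sampling settings; tweak freely, the model will be silly either way
output = model.generate(
    input_ids,
    max_new_tokens=30,
    do_sample=True,
    temperature=0.8,   # lower = more conservative
    top_k=50,          # restrict sampling to the 50 most likely tokens
    top_p=0.95,        # nucleus sampling cutoff
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```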