platz109M

A 109-million-parameter causal language model trained from scratch on OpenWebText using only free compute (Google Colab + Kaggle).
The project is a tribute to Tom Platz’s “no-excuses” mindset: we are not prisoners of our hardware (or genetics); we squeeze every rep out of what we have. Owing to hardware limitations, the model was trained on only a fraction of OpenWebText (around 513M tokens), but it still learned to generate grammatically coherent sentences and to maintain some context. This was undertaken as an educational proof-of-concept project.

Quick start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# The model reuses the distilgpt2 tokenizer.
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("YOUR_HF_NAME/platz109M")

prompt = "Leg workouts are very important"
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.92)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
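The `generate` call above uses nucleus (top-p) sampling: at each step, sampling is restricted to the smallest set of tokens whose cumulative probability reaches `top_p`, and the surviving probabilities are renormalized. A minimal sketch of that filtering step on a toy next-token distribution (the distribution and function name here are illustrative, not part of the model's API):

```python
def top_p_filter(probs, top_p=0.92):
    # Sort token indices by probability, descending, and keep the
    # smallest prefix whose cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalize the survivors so they form a valid distribution.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Toy next-token distribution over a 4-token vocabulary:
# tokens 0-2 cover 95% of the mass, so token 3 is filtered out.
print(top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.92))
```

Lower `top_p` values make generation more conservative; `do_sample=True` is what enables sampling at all, otherwise `generate` falls back to greedy decoding.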
Weights: safetensors, F32, ~0.1B parameters.