# platz109M
A 109-million-parameter causal language model trained from scratch on OpenWebText using only free compute (Google Colab + Kaggle).
The project is a tribute to Tom Platz’s “no-excuses” mindset: we are not prisoners of our hardware (or genetics)—we squeeze every rep out of what we have.
Owing to hardware limitations, the model was trained on only a fraction of OpenWebText (around 513M tokens), but it still acquired basic generative capabilities: it produces grammatically coherent sentences and maintains some short-range context.
This was undertaken as an educational project demonstrating proof of concept.
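For a sense of where the 109M figure comes from: the parameter count of a GPT-2-style decoder can be estimated directly from its configuration. The exact platz109M configuration (width, depth, context length) is not stated here, so the sketch below is illustrative only; the default arguments are the standard GPT-2 small config, used as a sanity check.

```python
def gpt2_param_count(vocab_size=50257, n_ctx=1024, d_model=768, n_layer=12):
    """Approximate parameter count of a GPT-2-style decoder-only transformer
    (learned positional embeddings, output head tied to the token embedding)."""
    embeddings = vocab_size * d_model + n_ctx * d_model  # token + position tables
    # Per block: QKV (3d^2 + 3d), attention output (d^2 + d),
    # MLP up-projection (4d^2 + 4d), MLP down-projection (4d^2 + d),
    # and two LayerNorms (4d)
    per_block = 12 * d_model**2 + 13 * d_model
    final_ln = 2 * d_model  # final LayerNorm before the output head
    return embeddings + n_layer * per_block + final_ln

# GPT-2 small lands at ~124M with this formula
print(gpt2_param_count())  # 124439808
# A model near 109M could trade depth against the embedding table, e.g. a
# hypothetical 10-layer variant (shown purely for illustration, ~110M):
print(gpt2_param_count(n_layer=10))
```

Note that the embedding table alone (~38.6M parameters at GPT-2's 50,257-token vocabulary) accounts for over a third of a model at this scale.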
## Quick start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# The model reuses the distilgpt2 tokenizer (GPT-2 BPE vocabulary)
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("YOUR_HF_NAME/platz109M")

prompt = "Leg workouts are very important"
inputs = tokenizer(prompt, return_tensors="pt")

# Nucleus sampling: keep the smallest token set whose cumulative probability is 0.92
out = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.92)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```