pot-o-22-slim

Ultra-slim GPT-2-style causal LM (22,200 trainable parameters) for PoT-O path / MML experiments. It pairs with the 22,222-example dataset Tribewarez/synthetic-pot-o-challanges-22-22k (signature: param_signature 22.2222).

Weights are randomly initialized; the model is intended for fine-tuning on the PoT-O challenge → optimal_path text task (see the dataset card). The architecture was chosen to land near 22.2k parameters with a 257-token byte-level tokenizer (256 byte values plus <|endoftext|>).
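To make the 257-token vocabulary concrete, here is a minimal sketch of a byte-level encoding in plain Python. It is not the repository's tokenizer: the id layout (ids 0 to 255 for raw byte values, id 256 for <|endoftext|>) and the helper names `encode` and `decode` are assumptions made for illustration, and the real tokenizer files may order the tokens differently.

```python
from typing import List

# Assumed id layout (not read from the repo): 0-255 = raw bytes, 256 = <|endoftext|>.
EOS_TOKEN = "<|endoftext|>"
EOS_ID = 256
VOCAB_SIZE = 257
MAX_LEN = 64  # same value as n_positions in the specs


def encode(text: str, max_length: int = MAX_LEN, add_eos: bool = False) -> List[int]:
    """Map text to UTF-8 byte ids, truncating to max_length tokens."""
    ids = list(text.encode("utf-8"))
    if add_eos:
        # Reserve one slot so the EOS id survives truncation.
        return ids[: max_length - 1] + [EOS_ID]
    return ids[:max_length]


def decode(ids: List[int]) -> str:
    """Drop special ids and decode the remaining bytes as UTF-8."""
    raw = bytes(i for i in ids if i < 256)
    return raw.decode("utf-8", errors="replace")


challenge = "tensor:shape=[32,64];dtype=float16;target_mml=0.22;ops:matmul,gelu"
ids = encode(challenge)
print(len(ids), decode(ids))
```

The sample challenge string is 66 bytes long, so it is cut to 64 tokens here. A prompt that fills the whole window leaves no positions for generated tokens, which is worth keeping in mind when choosing `max_length` for prompts.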

Specs


| Setting | Value |
|---|---|
| Architecture | `GPT2LMHeadModel` |
| `vocab_size` | 257 |
| `n_positions` | 64 (truncate long challenge strings for prefill) |
| `n_embd` | 24 |
| `n_layer` | 2 |
| `n_head` | 1 |
| `n_inner` | 96 |
| Parameters | 22,200 |
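The parameter total can be checked by hand from the values above. The sketch below tallies the weights of a standard GPT-2 layout (token and position embeddings, per-block attention and MLP with biases, two LayerNorms per block, one final LayerNorm) and assumes the LM head is tied to the token embedding, which is the `GPT2LMHeadModel` default. The function name `gpt2_param_count` is hypothetical and not part of `create_model.py`.

```python
def gpt2_param_count(vocab_size: int, n_positions: int, n_embd: int,
                     n_layer: int, n_inner: int) -> int:
    """Count GPT-2 parameters, assuming the LM head shares the token embedding."""
    wte = vocab_size * n_embd            # token embeddings
    wpe = n_positions * n_embd           # learned position embeddings
    layer_norm = 2 * n_embd              # weight + bias

    # Attention: fused QKV projection, then output projection (both with bias).
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: up-projection to n_inner, then down-projection (both with bias).
    mlp = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)

    block = 2 * layer_norm + attn + mlp  # ln_1, attention, ln_2, MLP
    return wte + wpe + n_layer * block + layer_norm  # final ln_f


total = gpt2_param_count(vocab_size=257, n_positions=64, n_embd=24,
                         n_layer=2, n_inner=96)
print(total)  # 22200
```

The breakdown is 6,168 (token embeddings) + 1,536 (position embeddings) + 2 × 7,224 (transformer blocks) + 48 (final LayerNorm) = 22,200. `n_head` does not appear because the number of heads only changes how the attention projections are split, not their size.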

Recreate artifacts

cd pot-o-22-slim
python create_model.py

Push to Hub

pip install transformers huggingface_hub
huggingface-cli login
python upload_model.py

Inference (after fine-tune)

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tribewarez/pot-o-22-slim"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

text = "tensor:shape=[32,64];dtype=float16;target_mml=0.22;ops:matmul,gelu"
inputs = tok(text, return_tensors="pt", max_length=64, truncation=True)
# ... generation
