pot-o-22-slim

Ultra-slim GPT-2-style causal LM (~22,200 trainable parameters) for PoT-O path / MML experiments. It pairs with the 22,222-example dataset Tribewarez/synthetic-pot-o-challanges-22-22k (param_signature 22.2222).

Weights are randomly initialized; the model is intended for fine-tuning on PoT-O challenge → optimal_path text (see the dataset card). The architecture was chosen to land near 22.2k parameters with a 257-token byte-level tokenizer (256 bytes + <|endoftext|>).
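The 257-token byte-level scheme described above can be sketched in a few lines of plain Python: ids 0–255 map to raw byte values and id 256 is reserved for <|endoftext|>. This is illustrative only; the actual tokenizer files ship with the model repo.

```python
EOS_ID = 256  # <|endoftext|>, the single special token on top of the 256 bytes

def encode(text: str, add_eos: bool = True) -> list[int]:
    """Map each UTF-8 byte of `text` to its integer value; optionally append EOS."""
    ids = list(text.encode("utf-8"))
    if add_eos:
        ids.append(EOS_ID)
    return ids

def decode(ids: list[int]) -> str:
    """Drop any EOS markers and decode the remaining byte values."""
    return bytes(i for i in ids if i != EOS_ID).decode("utf-8")
```

For example, encode("A") yields [65, 256], and decode is its inverse on the text portion, which is why vocab_size is exactly 257.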

Specs

Architecture: GPT2LMHeadModel
vocab_size: 257
n_positions: 64 (truncate long challenge strings for prefill)
n_embd: 24
n_layer: 2
n_head: 1
n_inner: 96
Parameters: 22,200
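The 22,200 figure can be checked by hand from the config above, assuming the standard GPT-2 parameter layout with the LM head weight-tied to the token embedding (so the head adds no extra parameters):

```python
# Back-of-envelope parameter count for the config above (tied lm_head assumed).
vocab, n_embd, n_pos, n_layer, n_inner = 257, 24, 64, 2, 96

wte = vocab * n_embd   # token embeddings (reused as lm_head)
wpe = n_pos * n_embd   # learned position embeddings
per_layer = (
    2 * n_embd                            # ln_1 weight + bias
    + n_embd * 3 * n_embd + 3 * n_embd    # c_attn: fused QKV projection
    + n_embd * n_embd + n_embd            # attention output projection
    + 2 * n_embd                          # ln_2 weight + bias
    + n_embd * n_inner + n_inner          # MLP up-projection (c_fc)
    + n_inner * n_embd + n_embd           # MLP down-projection (c_proj)
)
ln_f = 2 * n_embd                         # final layer norm
total = wte + wpe + n_layer * per_layer + ln_f
print(total)  # 22200
```

With these settings the sum lands exactly on 22,200, matching the stated parameter count.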

Recreate artifacts

cd pot-o-22-slim
python create_model.py

Push to Hub

pip install transformers huggingface_hub
huggingface-cli login
python upload_model.py

Inference (after fine-tune)

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tribewarez/pot-o-22-slim"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

text = "tensor:shape=[32,64];dtype=float16;target_mml=0.22;ops:matmul,gelu"
inputs = tok(text, return_tensors="pt", max_length=64, truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=48)
print(tok.decode(output_ids[0], skip_special_tokens=True))

Links

MIT licensed • Tribewarez guild • live beta
