BoggersTheFish
Phase 2 TS-native model — 13.5M params, open-web-math, val PPL 86.50
f52c9f1 verified
{
  "arch": "tension",
  "vocab_size": 32768,
  "dim": 256,
  "num_layers": 6,
  "num_heads": 4,
  "window": 32,
  "ffn_mult": 3,
  "max_seq_len": 256,
  "dropout": 0.1,
  "use_grad_checkpoint": false,
  "use_oscillation": true,
  "use_rope": false,
  "use_triton": false
}
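
To make the sizes in this config easier to read, here is a minimal Python sketch that parses it and derives two quantities: the per-head width and a rough parameter count. The `estimate_params` helper is hypothetical and is not part of this repository. It assumes a generic transformer-style block (four `dim x dim` attention projections plus a feed-forward network with hidden size `ffn_mult * dim`) and ignores biases, norms, positional tables, and anything specific to the "tension" architecture or the `use_oscillation` option, whose actual layouts are not described here.

```python
import json

# The config above, inlined so the sketch is self-contained.
CONFIG_TEXT = """
{
  "arch": "tension", "vocab_size": 32768, "dim": 256, "num_layers": 6,
  "num_heads": 4, "window": 32, "ffn_mult": 3, "max_seq_len": 256,
  "dropout": 0.1, "use_grad_checkpoint": false, "use_oscillation": true,
  "use_rope": false, "use_triton": false
}
"""


def estimate_params(cfg, gated_ffn=False, tied_embeddings=True):
    """Rough weight count under ASSUMED generic transformer shapes.

    Hypothetical helper: the real "tension" block may be laid out differently.
    """
    dim = cfg["dim"]
    hidden = cfg["ffn_mult"] * dim           # assumed FFN hidden width
    embed = cfg["vocab_size"] * dim          # token embedding table
    attn = 4 * dim * dim                     # assumed Q, K, V, and output projections
    ffn_mats = 3 if gated_ffn else 2         # a gated FFN adds a third matrix
    ffn = ffn_mats * dim * hidden
    blocks = cfg["num_layers"] * (attn + ffn)
    head = 0 if tied_embeddings else embed   # untied output head doubles the table
    return embed + blocks + head


cfg = json.loads(CONFIG_TEXT)
head_dim = cfg["dim"] // cfg["num_heads"]

print(head_dim)                               # 64
print(estimate_params(cfg))                   # 12320768
print(estimate_params(cfg, gated_ffn=True))   # 13500416
```

Under these assumptions the token embedding table (32768 x 256 = 8,388,608 weights) accounts for most of the model. The gated, tied-embedding variant lands close to the 13.5M figure in the commit message, which is suggestive but does not confirm how the actual model is built.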