EleutherAI/pythia-70m

New architecture: TMT — dynamic graph attention + adaptive depth routing, 29.4 PPL at 48% compute (120M params)

#8 opened 19 days ago by

Add MOT badge

#7 opened 8 months ago by

Adding `safetensors` variant of this model

#6 opened 9 months ago by

What was the training set?

#5 opened over 1 year ago by

Prompt Template

#4 opened almost 2 years ago by