BestTerm-440M Checkpoint

Tentative checkpoint for BestTerm-440M.

This is a global parameter-vector SLERP (spherical linear interpolation) between the released N8Programs/NextTerm-440M base model and the hot b-file-only continued-pretraining checkpoint at 500M tokens, with interpolation factor t=0.80.
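As a rough illustration of what a global parameter-vector SLERP involves, the sketch below flattens every tensor into one long vector, measures a single angle between the two vectors, and applies the same pair of SLERP weights to all parameters. It is not the script used to build this checkpoint. The function name `global_slerp`, the dict-of-lists representation, and the convention that t=0 returns the first model and t=1 the second are assumptions made for the example; the card does not state which endpoint t=0.80 is closer to.

```python
import math

def global_slerp(base, other, t, eps=1e-8):
    """SLERP two models treated as single flattened parameter vectors.

    base, other: dict mapping parameter name -> flat list of floats
    (hypothetical stand-in for real tensors). Assumes both vectors are
    non-zero. Assumed convention: t=0 -> base, t=1 -> other.
    """
    names = sorted(base)
    v0 = [x for n in names for x in base[n]]
    v1 = [x for n in names for x in other[n]]

    # One angle for the whole model, not one angle per tensor.
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    cos_omega = max(-1.0, min(1.0, dot / (n0 * n1)))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)

    if sin_omega < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        w0, w1 = 1.0 - t, t
    else:
        w0 = math.sin((1.0 - t) * omega) / sin_omega
        w1 = math.sin(t * omega) / sin_omega

    # Apply the same two weights to every parameter.
    return {n: [w0 * a + w1 * b for a, b in zip(base[n], other[n])]
            for n in names}

# Toy usage with two orthogonal "models".
merged = global_slerp({"w": [1.0], "b": [0.0]}, {"w": [0.0], "b": [1.0]}, 0.80)
```

Because the weights come from one global angle, layers are not renormalized individually, which is the main difference from a per-tensor SLERP.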

Quick Scores

Ryskina & Knight sequence completion, exact next-term accuracy:

  • Greedy, old PyTorch/Transformers sanity path: 38/57 (66.67%).
  • Beam search, num_beams=4, comma stop and EOS/PAD suppressed: 40/57 (70.18%).
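The scoring behind these figures can be sketched as follows, assuming "comma stop" means that only the text before the first comma of a generation counts as the predicted next term. The evaluation harness itself is not shown in this card, so the helper names `first_term` and `exact_next_term_accuracy` are hypothetical.

```python
def first_term(generated: str) -> str:
    """Keep only the text before the first comma (assumed comma-stop rule)."""
    return generated.split(",", 1)[0].strip()

def exact_next_term_accuracy(generations, targets):
    """Return (correct, total, percent) under exact string match."""
    correct = sum(first_term(g) == str(t) for g, t in zip(generations, targets))
    total = len(targets)
    return correct, total, 100.0 * correct / total

# Toy usage: two of three predictions match their targets exactly.
correct, total, pct = exact_next_term_accuracy(["13, 21", " 8,", "7"], [13, 8, 9])
```

With 57 test sequences, 38 and 40 correct answers correspond to the 66.67% and 70.18% reported above.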

Other preservation checks from the same sweep:

  • OEIS-Eval-Neo: 6532/19034 (34.318%).
  • M1 Competition 111 macro MAPE: 17.582548.
  • Polynomial continuation: arithmetic 94.5625%, quadratic 86.3043%, cubic 74.5682%, quartic 67.9524%.
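For readers unfamiliar with the M1 metric, a minimal sketch of a macro-averaged MAPE follows. It assumes "macro" means that MAPE is computed per series and then averaged with equal weight across series, and that the value is expressed in percent; the card does not spell out either convention, and the function names are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error of one series, in percent.

    Assumes no actual value is zero.
    """
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)

def macro_mape(series):
    """Average the per-series MAPE so every series has equal weight.

    series: list of (actual, forecast) pairs of equal-length lists.
    """
    return sum(mape(a, f) for a, f in series) / len(series)

# Toy usage: one series with 10% error at each step, one perfect series.
score = macro_mape([([100.0, 200.0], [110.0, 180.0]), ([50.0], [50.0])])
```

Under this convention a short series counts as much as a long one, unlike a micro average over all forecast points.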

Notes

This checkpoint is tentative. It was selected as the aggressive Ryskina/M1 point (t=0.80) on the base-to-hot500 SLERP sweep; the more conservative preservation point was t=0.60.

Safetensors: model size 0.4B params, tensor type BF16.

Model tree for N8Programs/BestTerm-440M-Checkpts: this model is listed as a fine-tune.