---
library_name: transformers
base_model: N8Programs/NextTerm-440M
tags:
- qwen3
- oeis
- sequence-modeling
- slerp
---
# BestTerm-440M Checkpoint
Tentative checkpoint for `BestTerm-440M`.
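The weights are a global parameter-vector SLERP (spherical linear interpolation applied to all parameters treated as one flattened vector) between the released `N8Programs/NextTerm-440M` base model and the "hot" b-file-only continued-pretraining checkpoint at 500M tokens, with interpolation factor `t=0.80`. Judging by the sweep described in the notes, `t=0` corresponds to the base model and `t=1` to the hot checkpoint.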
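
For readers unfamiliar with the technique, here is a minimal sketch of a global parameter-vector SLERP in plain Python. It is not the script used to produce this checkpoint: the function names `slerp` and `global_slerp`, the dict-of-lists parameter layout, and the fallback to linear interpolation for degenerate angles are all assumptions made for illustration. A real merge would operate on the tensors of the two `state_dict`s instead of Python lists.

```python
import math

def slerp(v0, v1, t, eps=1e-8):
    """Spherical linear interpolation between two flat vectors (lists of floats).

    t=0 returns v0, t=1 returns v1. Hypothetical helper, not the card author's code.
    """
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    if n0 < eps or n1 < eps:
        # A zero vector has no direction: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    dot = sum(a * b for a, b in zip(v0, v1))
    # Angle between the two full parameter vectors.
    omega = math.acos(max(-1.0, min(1.0, dot / (n0 * n1))))
    s = math.sin(omega)
    if abs(s) < eps:
        # Nearly parallel or antiparallel: SLERP weights are ill-conditioned.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * omega) / s
    w1 = math.sin(t * omega) / s
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

def global_slerp(params_a, params_b, t):
    """SLERP two {name: list_of_floats} parameter sets as ONE concatenated vector.

    "Global" means a single angle is computed over all parameters together,
    rather than one angle per tensor.
    """
    names = sorted(params_a)
    flat_a = [x for n in names for x in params_a[n]]
    flat_b = [x for n in names for x in params_b[n]]
    merged = slerp(flat_a, flat_b, t)
    out, i = {}, 0
    for n in names:
        k = len(params_a[n])
        out[n] = merged[i:i + k]  # slice the merged vector back into named tensors
        i += k
    return out
```

Under this convention, `global_slerp(base, hot, 0.80)` would land closer to the hot checkpoint than `global_slerp(base, hot, 0.60)`.
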
## Quick Scores
Ryskina & Knight sequence completion, exact next-term accuracy:
- Greedy, old PyTorch/Transformers sanity path: `38/57` (`66.67%`).
- Beam search (`num_beams=4`), stopping at the first comma, with EOS/PAD suppressed: `40/57` (`70.18%`).
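
As a rough illustration of how an exact next-term score like the ones above can be computed, the sketch below cuts each generated continuation at the first comma and compares the resulting integer with the expected term. The helper names `first_term` and `exact_next_term_accuracy` are hypothetical, and treating unparsable output as a miss is an assumption, not a documented detail of the evaluation harness.

```python
def first_term(text):
    """Return the text before the first comma, stripped: one predicted term."""
    return text.split(",", 1)[0].strip()

def exact_next_term_accuracy(predictions, targets):
    """Count cases whose first predicted term equals the target integer.

    Returns (correct, total). Hypothetical scorer for illustration only.
    """
    correct = 0
    for pred, target in zip(predictions, targets):
        try:
            correct += int(first_term(pred)) == int(target)
        except ValueError:
            pass  # assumption: output that is not an integer counts as wrong
    return correct, len(targets)

# Formatting a count as a percentage, using the figures reported above.
greedy = f"{38 / 57:.2%}"
beam = f"{40 / 57:.2%}"
```
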
Other preservation checks from the same sweep:
- OEIS-Eval-Neo: `6532/19034` (`34.318%`).
- M1 Competition 111 macro MAPE: `17.582548`.
- Polynomial continuation: arithmetic `94.5625%`, quadratic `86.3043%`, cubic `74.5682%`, quartic `67.9524%`.
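
The card does not define "macro MAPE". One common reading, assumed here, is that the mean absolute percentage error is computed per series and then averaged with equal weight across series (111 of them for this M1 subset), expressed in percent. The sketch below follows that reading; `mape` and `macro_mape` are hypothetical helpers, and skipping zero-valued actuals is an illustrative choice.

```python
def mape(actual, forecast):
    """Mean absolute percentage error for one series, in percent."""
    terms = [abs((a - f) / a) for a, f in zip(actual, forecast) if a != 0]
    if not terms:
        raise ValueError("series has no non-zero actual values")
    return 100.0 * sum(terms) / len(terms)

def macro_mape(series):
    """Average of per-series MAPEs; each series counts equally regardless of length.

    `series` is a list of (actual, forecast) pairs.
    """
    per_series = [mape(actual, forecast) for actual, forecast in series]
    return sum(per_series) / len(per_series)
```

With this definition, a short series and a long series contribute equally to the final number, which is what distinguishes a macro average from pooling every forecast point together.
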
## Notes
This checkpoint is tentative. It was selected as the aggressive point on the base-to-hot500 SLERP sweep, favoring the Ryskina & Knight and M1 scores. The more conservative preservation point on the same sweep was `t=0.60`.