darkmaniac7/Qwen3-0.6B-kl-baseline-20k-MNN
Small MNN draft models and speculative-decoding bundles for TokForge on Android. Includes practical Qwen3 0.6B drafts plus experimental variants.
Note Practical 20K KL-baseline Qwen3 0.6B draft for TokForge + MNN. Roughly +30% uplift on RedMagic 8B in our preserved short-run packet.
Note Acceptance-oriented 20K LK Alpha variant. Device uplift falls in a similar band to the KL baseline, which makes it a useful comparison artifact.
Note Experimental 40K LK Alpha draft. Research comparison artifact rather than the default recommendation.
Note Experimental 14B-paired Qwen3 draft for more target-specific speculative-decoding experiments.
Note Experimental Qwen3.5 0.8B draft bundle. Uploaded for reproducibility and research, not the main practical recommendation.