TokForge Acceleration Pack - Qwen3.5 Draft (Deprecated)
Deprecated: this Qwen3.5-0.8B draft bundle is preserved for reproducibility, but the newer Qwen3-0.6B TokForge draft line is the practical default.
Why it is deprecated
Internal fleet testing showed that the smaller Qwen3-0.6B draft family was the better practical draft lane across our mobile tests.
Representative comparison from our internal notes:
| Target | 0.6B Draft | 0.8B Draft |
|---|---|---|
| Qwen3.5-9B | +59% | +16% |
| Qwen3-8B | +57% | -41% (wrong arch) |
| Qwen3-14B | +67% | -32% (wrong arch) |
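To make the percentages in the table easier to read, here is a minimal sketch of how such a figure could be computed. It assumes each value is the relative change in decode throughput (tokens per second) with the draft model enabled versus the same target running without one; the card does not state the exact metric, so treat that as an assumption. The function name and the throughput numbers are hypothetical and are not measurements from this repo.

```python
def relative_speedup_percent(baseline_tok_s: float, drafted_tok_s: float) -> float:
    """Percent change in throughput when a draft model is enabled.

    Assumption: the table reports (drafted / baseline - 1) * 100.
    Positive means faster than baseline, negative means slower.
    """
    if baseline_tok_s <= 0:
        raise ValueError("baseline throughput must be positive")
    return (drafted_tok_s / baseline_tok_s - 1.0) * 100.0


# Hypothetical throughput numbers, chosen only to illustrate the sign convention.
print(f"{relative_speedup_percent(10.0, 15.9):+.0f}%")  # draft helps
print(f"{relative_speedup_percent(10.0, 5.9):+.0f}%")   # draft hurts
```

Under this reading, a negative entry means the draft made generation slower than running the target alone, which is why a mismatched draft can be worse than no draft at all.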
Use instead
- Recommended replacement: TokForge-AccelerationPack-Draft
- Collection: TokForge Mobile Draft Models
Performance Notes
This bundle is preserved because it was an important research step, but it is not the current practical winner. In our internal mobile testing, the newer Qwen3-0.6B draft family consistently provided better speedups and better architecture pairing.
What is included
- llm.mnn
- llm.mnn.weight
- llm_config.json
- tokenizer file(s)
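If you download the bundle for a reproduction run, a quick presence check can catch an incomplete download early. The sketch below only looks for the three file names listed above; the tokenizer file name is not specified on this card, so it is deliberately not checked. The helper name `missing_bundle_files` and the directory argument are hypothetical.

```python
from pathlib import Path

# File names taken from the list above. The tokenizer file name is not
# given on this card, so it is intentionally left out of the check.
EXPECTED_FILES = ("llm.mnn", "llm.mnn.weight", "llm_config.json")


def missing_bundle_files(bundle_dir):
    """Return the expected bundle files that are absent from bundle_dir."""
    root = Path(bundle_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "." is a placeholder; point this at wherever the bundle was downloaded.
    missing = missing_bundle_files(".")
    print("missing files:", missing if missing else "none")
```

This only confirms the files exist; it does not validate their contents or compatibility with a given MNN build.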
Usage
This repo is for TokForge / MNN users who specifically want to reproduce the older Qwen3.5-0.8B draft path.
Limitations and Intended Use
- Deprecated for normal use.
- Cross-family drafting was weaker than the later same-family Qwen3-0.6B draft lane.
- Keep using this only if you need reproducibility for old experiments.
TokForge
- Website: tokforge.ai
- Discord: Join the Discord