TokForge Acceleration Pack — Qwen3.5 Draft (Deprecated)

Deprecated: this Qwen3.5-0.8B draft bundle is preserved for reproducibility, but the newer Qwen3-0.6B TokForge draft line is the practical default.

Why it is deprecated

Internal fleet testing on mobile devices showed that the smaller Qwen3-0.6B draft family gave larger speedups than this Qwen3.5-0.8B draft on every target model we compared.

Representative comparison from our internal notes:

Target model	Qwen3-0.6B draft speedup	Qwen3.5-0.8B draft speedup
Qwen3.5-9B	+59%	+16%
Qwen3-8B	+57%	-41% (wrong architecture)
Qwen3-14B	+67%	-32% (wrong architecture)
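
To make the percentages easier to compare, here is a minimal sketch that turns each table entry into a speed multiplier and picks the better draft per target. It assumes each figure is the change in decode speed relative to running the same target without a draft model; the notes above do not state the baseline, so treat that reading as an assumption. The function names are illustrative and not part of TokForge.

```python
# Speed changes copied from the comparison table, in percent.
# Assumption: each figure is the change in decode speed relative to running
# the same target without a draft model; the source does not state the baseline.
SPEED_CHANGE_PCT = {
    "Qwen3.5-9B": {"Qwen3-0.6B": 59, "Qwen3.5-0.8B": 16},
    "Qwen3-8B": {"Qwen3-0.6B": 57, "Qwen3.5-0.8B": -41},
    "Qwen3-14B": {"Qwen3-0.6B": 67, "Qwen3.5-0.8B": -32},
}


def to_multiplier(change_pct):
    """Convert a percent change into a speed multiplier (+59 -> 1.59)."""
    return 1.0 + change_pct / 100.0


def better_draft(target):
    """Return the draft with the larger speed change for one target."""
    row = SPEED_CHANGE_PCT[target]
    return max(row, key=row.get)


def summarize():
    """Build one human-readable line per target model."""
    lines = []
    for target, row in SPEED_CHANGE_PCT.items():
        parts = [f"{draft}: {to_multiplier(pct):.2f}x" for draft, pct in row.items()]
        lines.append(f"{target}: " + ", ".join(parts) + f" -> {better_draft(target)}")
    return lines
```

Under that assumption, -41% corresponds to a 0.59x multiplier, meaning the Qwen3-8B target ran slower with the 0.8B draft than it would with no draft at all.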

Use instead

Recommended replacement: TokForge-AccelerationPack-Draft
Collection: TokForge Mobile Draft Models

Performance Notes

This bundle is preserved because it was an important research step, but it is no longer the recommended draft. In our internal mobile testing, the newer Qwen3-0.6B draft family consistently gave larger speedups and paired better with the target architectures we tested.

What is included

llm.mnn
llm.mnn.weight
llm_config.json
tokenizer file(s)
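
As a quick sanity check after downloading, the following sketch reports which of the files listed above are absent from a local directory. The three fixed names come from the list; the tokenizer file names are not given there, so the `tokenizer*` pattern is an assumption, and `missing_bundle_files` is a hypothetical helper rather than a TokForge or MNN function.

```python
from pathlib import Path

# File names taken from the "What is included" list.
REQUIRED_FILES = ("llm.mnn", "llm.mnn.weight", "llm_config.json")

# Assumption: tokenizer files start with "tokenizer"; the list does not name them.
TOKENIZER_GLOB = "tokenizer*"


def missing_bundle_files(bundle_dir):
    """Return the expected bundle entries that are not present in bundle_dir."""
    root = Path(bundle_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # Require at least one file matching the assumed tokenizer pattern.
    if not any(root.glob(TOKENIZER_GLOB)):
        missing.append(TOKENIZER_GLOB)
    return missing
```

An empty list means every expected entry was found; anything else names what to re-download.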

Usage

This repo is for TokForge / MNN users who specifically want to reproduce the older Qwen3.5-0.8B draft path.
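
For reproduction runs, one way to fetch the bundle is through the `huggingface_hub` Python package. This is a sketch, not a TokForge-specific procedure: it assumes `huggingface_hub` is installed, uses the repository name shown on this model page, and the `download_bundle` helper and its default directory name are hypothetical.

```python
# Repository name as shown on this model page.
REPO_ID = "darkmaniac7/TokForge-AccelerationPack-Qwen35-Draft"


def download_bundle(local_dir="qwen35-draft"):
    """Download every file in the repo into local_dir and return that path."""
    # Imported lazily so this module still loads where huggingface_hub is absent.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_bundle()` needs network access; how the downloaded directory is then loaded depends on your TokForge / MNN setup and is not covered here.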

Limitations and Intended Use

Deprecated for normal use.
Cross-family drafting (this Qwen3.5 draft with Qwen3 targets) produced slowdowns in our tests (-41% and -32%) and was weaker than the later same-family Qwen3-0.6B draft lane.
Keep using this only if you need reproducibility for old experiments.

TokForge

Website: tokforge.ai
Discord: Join the Discord


Model tree for darkmaniac7/TokForge-AccelerationPack-Qwen35-Draft

Base model: Qwen/Qwen3.5-0.8B-Base
Finetuned: Qwen/Qwen3.5-0.8B
Finetuned: this model