darkmaniac7/TokForge-AccelerationPack-Draft

Text Generation

speculative-decoding

TokForge-AccelerationPack-Draft

339 MB

1 contributor

History: 15 commits

Add TokForge app links

3c53adc verified 3 days ago

File | Size | Last commit | Updated
.gitattributes | 1.61 kB | Upload folder using huggingface_hub | 4 months ago
README.md | 5.27 kB | Add TokForge app links | 3 days ago
config.json | 211 Bytes | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 4 months ago
config_cpu.json | 211 Bytes | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 4 months ago
config_opencl.json | 172 Bytes | Add config_opencl.json for OpenCL draft backend support | 4 months ago
draft_config_cpu.json | 211 Bytes | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 4 months ago
llm.mnn | 504 kB | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 4 months ago
llm.mnn.json | 1.01 MB | Upload llm.mnn.json with huggingface_hub | 3 months ago
llm.mnn.weight | 336 MB | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 4 months ago
llm_config.json | 4.66 kB | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 4 months ago
runtime_config.json | 1.36 kB | Add runtime_config.json with optimal spec decode settings | 4 months ago
tokenizer.txt | 1.61 MB | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 4 months ago