darkmaniac7/TokForge-AccelerationPack-Draft

Text Generation
English
mnn
speculative-decoding
draft-model
qwen3
tokforge
TokForge-AccelerationPack-Draft (338 MB)
  • 1 contributor
History: 13 commits
darkmaniac7
Upload README.md with huggingface_hub
994747a verified 9 days ago
  • .gitattributes (1.61 kB): Upload folder using huggingface_hub, 16 days ago
  • README.md (5 kB): Upload README.md with huggingface_hub, 9 days ago
  • config.json (211 Bytes): v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850), 12 days ago
  • config_cpu.json (211 Bytes): v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850), 12 days ago
  • config_opencl.json (172 Bytes): Add config_opencl.json for OpenCL draft backend support, 12 days ago
  • draft_config_cpu.json (211 Bytes): v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850), 12 days ago
  • llm.mnn (504 kB, xet): v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850), 12 days ago
  • llm.mnn.weight (336 MB, xet): v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850), 12 days ago
  • llm_config.json (4.66 kB): v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850), 12 days ago
  • runtime_config.json (1.36 kB): Add runtime_config.json with optimal spec decode settings, 11 days ago
  • tokenizer.txt (1.61 MB): v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850), 12 days ago
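
The files listed above could be fetched with the `huggingface_hub` client that the commit messages mention. The following is a minimal sketch, not the author's documented procedure: the repository ID is assembled from the owner and repository name shown on this page, while `PACK_FILES`, `download_pack`, `missing_files`, and the `tokforge-draft` target directory are hypothetical names chosen for illustration. It assumes `huggingface_hub` is installed and the repository is publicly readable.

```python
from pathlib import Path

# Assumption: repo ID built from the owner and repository name on this page.
REPO_ID = "darkmaniac7/TokForge-AccelerationPack-Draft"

# Model and config files from the listing (README.md and .gitattributes omitted).
PACK_FILES = [
    "config.json",
    "config_cpu.json",
    "config_opencl.json",
    "draft_config_cpu.json",
    "llm.mnn",
    "llm.mnn.weight",
    "llm_config.json",
    "runtime_config.json",
    "tokenizer.txt",
]


def missing_files(directory):
    """Return the listed pack files that are not present in `directory`."""
    root = Path(directory)
    return [name for name in PACK_FILES if not (root / name).is_file()]


def download_pack(target_dir="tokforge-draft"):
    """Download the listed files into `target_dir` (hypothetical location)."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=PACK_FILES,  # fetch only the files named above
        local_dir=target_dir,
    )
    return Path(local_path)
```

Calling `download_pack()` and then `missing_files("tokforge-draft")` gives a quick check that every listed file arrived; an empty list means the download is complete.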