Kimi-K2.7-Code DFlash draft

DFlash speculative-decoding draft model for moonshotai/Kimi-K2.7-Code, trained with SpecForge (PR #593) on NVIDIA Nemotron-Post-Training-Dataset-v2 (stem+chat+math+code).

  • 6-layer Qwen3-style draft (hidden 7168); consumes target hidden states at layers [1,12,24,35,47,58]; block_size 8.
  • Target vocab/tokenizer: Kimi-K2.7-Code (vocab 163840, mask_token_id 163838).
  • Checkpoint: epoch_1_step_116000 (work-in-progress snapshot; training is still running).
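
The settings above can be collected in one place and sanity-checked before wiring the draft to a target. The following is a minimal sketch, not code from dflash.py: `DraftSpec` and `pick_target_states` are hypothetical names, and the assumption that the listed layer numbers index directly into a per-layer list of target hidden states is ours, not something the card states.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class DraftSpec:
    """Hypothetical container for the values quoted in this card."""
    num_layers: int = 6
    hidden_size: int = 7168
    target_layer_ids: Tuple[int, ...] = (1, 12, 24, 35, 47, 58)
    block_size: int = 8
    vocab_size: int = 163840
    mask_token_id: int = 163838

    def validate(self) -> None:
        # The mask token must be a valid id in the target vocabulary.
        if not 0 <= self.mask_token_id < self.vocab_size:
            raise ValueError("mask_token_id is outside the vocabulary")
        # Tapped layers should be unique and listed in increasing order.
        if list(self.target_layer_ids) != sorted(set(self.target_layer_ids)):
            raise ValueError("target_layer_ids must be strictly increasing")
        if self.block_size < 1:
            raise ValueError("block_size must be positive")


def pick_target_states(all_states: Sequence, layer_ids: Sequence[int]) -> list:
    """Select the tapped hidden states.

    Assumption: all_states[i] is the hidden state associated with layer i.
    The real indexing convention depends on the model code.
    """
    return [all_states[i] for i in layer_ids]


spec = DraftSpec()
spec.validate()
# Dummy stand-ins for per-layer hidden states, long enough to cover index 58.
dummy_states = [f"h{i}" for i in range(59)]
print(pick_target_states(dummy_states, spec.target_layer_ids))
```

If the real implementation counts the embedding output as index 0, or numbers layers from 1, the selection would shift by one, so the convention should be checked against dflash.py before relying on this.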

Load with trust_remote_code=True (model code in dflash.py). Intended as the draft in SGLang DFlash speculative decoding paired with the Kimi-K2.7-Code target.
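
A minimal loading sketch with Hugging Face transformers follows. It assumes the repository id cm00cm/Kimi-K2.7-Code-DFlash (the id this card is published under) and assumes the custom class in dflash.py is registered for `AutoModel`; if the repository maps it to a different auto class, that class would be needed instead. `load_draft` is a hypothetical helper name.

```python
REPO_ID = "cm00cm/Kimi-K2.7-Code-DFlash"  # assumed repository id for this card


def load_draft(repo_id: str = REPO_ID):
    """Load the draft weights in BF16 using the repository's custom model code."""
    # Imported lazily so this file can be imported without torch installed.
    import torch
    from transformers import AutoModel

    return AutoModel.from_pretrained(
        repo_id,
        trust_remote_code=True,      # required: model code lives in dflash.py
        torch_dtype=torch.bfloat16,  # the published tensors are BF16
    )


if __name__ == "__main__":
    draft = load_draft()
    print(type(draft).__name__)
```

Because trust_remote_code=True executes Python from the repository, it is worth reviewing dflash.py or pinning a specific revision before loading. This only loads the draft on its own; using it for speculative decoding still requires a serving stack such as SGLang and the Kimi-K2.7-Code target.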

Safetensors: model size 3B params; tensor type BF16.