How to use cm00cm/Kimi-K2.7-Code-DFlash with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="cm00cm/Kimi-K2.7-Code-DFlash", trust_remote_code=True)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("cm00cm/Kimi-K2.7-Code-DFlash", trust_remote_code=True)
model = AutoModel.from_pretrained("cm00cm/Kimi-K2.7-Code-DFlash", trust_remote_code=True)
```
---
license: other
base_model: moonshotai/Kimi-K2.7-Code
library_name: transformers
tags:
- speculative-decoding
- dflash
- eagle
- draft-model
- kimi-k2
- specforge
---
# Kimi-K2.7-Code DFlash draft

A DFlash speculative-decoding **draft model** for [moonshotai/Kimi-K2.7-Code](https://huggingface.co/moonshotai/Kimi-K2.7-Code), trained with [SpecForge](https://github.com/sgl-project/SpecForge) (PR #593) on the NVIDIA Nemotron-Post-Training-Dataset-v2 (stem, chat, math, and code).

- 6-layer Qwen3-style draft (hidden size 7168) that consumes target hidden states from layers [1, 12, 24, 35, 47, 58]; `block_size` is 8.
- Target vocabulary and tokenizer: Kimi-K2.7-Code (vocab size 163840, `mask_token_id` 163838).
- Checkpoint: `epoch_4_step_334000`. **Work-in-progress snapshot**; training is still running.
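
To make the numbers in the list above concrete, here is a minimal sketch that records them in one place and shows how they might be used. The names `DraftSpec`, `select_target_features`, and `make_masked_block` are hypothetical and are not part of `dflash.py`. Two points are assumptions rather than facts from this card: that the layer ids index directly into a per-layer list of target hidden states, and that a draft block consists of one known token followed by mask placeholders.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class DraftSpec:
    """Hypothetical container for the values quoted on this card."""
    num_layers: int = 6
    hidden_size: int = 7168
    target_layer_ids: Tuple[int, ...] = (1, 12, 24, 35, 47, 58)
    block_size: int = 8
    vocab_size: int = 163840
    mask_token_id: int = 163838


def select_target_features(hidden_states: Sequence, layer_ids: Sequence[int]) -> List:
    # Assumption: hidden_states[i] holds the output of target layer i.
    return [hidden_states[i] for i in layer_ids]


def make_masked_block(anchor_token: int, spec: DraftSpec) -> List[int]:
    # Assumption: a block is one known token followed by mask placeholders
    # that the draft fills in; the real layout is defined by the model code.
    return [anchor_token] + [spec.mask_token_id] * (spec.block_size - 1)


spec = DraftSpec()
block = make_masked_block(42, spec)  # 42 is an arbitrary example token id
```

Whatever the actual block layout, `mask_token_id` has to be a valid id in the target vocabulary, which holds for the values above (163838 < 163840).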

Load with `trust_remote_code=True`; the model code lives in `dflash.py`. The model is intended to serve as the draft in SGLang DFlash speculative decoding, paired with the Kimi-K2.7-Code target.
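
For readers new to the draft/target pairing, the following is a minimal sketch of generic greedy verification in speculative decoding. It is not SGLang's or DFlash's actual implementation, and `accept_greedy` is a hypothetical name. It assumes the target has already scored the drafted block, so its own greedy choice is available at every drafted position plus one more.

```python
from typing import List, Sequence


def accept_greedy(draft_tokens: Sequence[int], target_tokens: Sequence[int]) -> List[int]:
    """Return the tokens committed in one verification step.

    draft_tokens:  k tokens proposed by the draft model.
    target_tokens: k + 1 greedy picks from the target, where target_tokens[i]
                   is the target's choice after the prefix plus draft_tokens[:i].
    """
    if len(target_tokens) != len(draft_tokens) + 1:
        raise ValueError("expected one more target token than draft tokens")

    accepted: List[int] = []
    for i, tok in enumerate(draft_tokens):
        if tok != target_tokens[i]:
            break  # first disagreement ends the accepted prefix
        accepted.append(tok)

    # The target's own token at the first mismatch (or after a fully
    # accepted block) is always committed, so each step yields >= 1 token.
    accepted.append(target_tokens[len(accepted)])
    return accepted


step = accept_greedy([5, 6, 7], [5, 6, 9, 1])
```

In this example the first two drafted tokens agree with the target and the third does not, so the step commits the two agreeing tokens followed by the target's replacement for the third.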