Kimi-K2.7-Code-MLX-4bit-hiprec

MLX (Apple Silicon) conversion of moonshotai/Kimi-K2.7-Code — a ~1T-parameter (32B active) DeepSeek-V3-style MoE coding model. Text-only build (vision tower dropped during conversion).

What "hiprec" means

The source checkpoint is natively 4-bit: its routed experts ship as compressed-tensors int4 (group size 32), while the attention, shared/dense MLPs and lm_head are left in bf16. The experts (the ~95% bulk) therefore cannot be made higher-precision — there is no 5/6/8-bit version of this model to convert from.

This build keeps the experts at their native 4-bit and quantizes the otherwise-bf16 layers to 6-bit (group 64) instead of crushing them to 4-bit. Net effect (~5.0 bits/weight, ~600 GB): higher fidelity on attention/router/dense/lm_head than a uniform-4-bit MLX build, at a small size premium.

If you want the smallest footprint instead, the uniform sub-4-bit community builds (inferencerlabs 3.5-bit, spicyneuron 3.6-bit, mlx-community 4-bit) are the alternatives.

Requirements

~600 GB on disk and roughly 768 GB+ of unified memory to run (it does not fit a 512 GB machine).

Use with mlx-lm

pip install mlx-lm
python -m mlx_lm generate --model pipenetwork/Kimi-K2.7-Code-MLX-4bit-hiprec --trust-remote-code --prompt "Write a Python LRU cache." -m 512

Validation

Not run-tested by the publisher — the model exceeds the conversion host's RAM. Verified by file-integrity check (weight index, shard presence, config, tokenizer) only.

License

Released under the Kimi K2 license (see LICENSE). Quantization config (excerpt): bits=4, group_size=32 for experts; non-expert layers at 6-bit/group-64.

Downloads last month
829
Safetensors
Model size
1T params
Tensor type
BF16
·
U32
·
F32
·
MLX
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for pipenetwork/Kimi-K2.7-Code-MLX-4bit-hiprec

Quantized
(14)
this model

Collection including pipenetwork/Kimi-K2.7-Code-MLX-4bit-hiprec