metadata
license: other
license_name: modified-mit
license_link: https://huggingface.co/moonshotai/Kimi-K2.7-Code/blob/main/LICENSE
base_model: moonshotai/Kimi-K2.7-Code
tags:
  - mlx
  - vision
  - kimi
  - exo

Kimi-K2.7-Code-vision

Vision-only weights (MoonViT tower + multimodal projector) extracted from moonshotai/Kimi-K2.7-Code for use with MLX-based inference stacks such as exo, in the same format as exolabs/Kimi-K2.6-vision.

Contents

  • kimi_k27_vision.safetensors — all 335 vision_tower.* and mm_projector.* tensors from the official repo (shards 63–64), original bfloat16, unmodified.
  • config.json — vision config copied from the official config.json (verified byte-identical to Kimi-K2.6's vision config: 27-layer MoonViT, hidden 1152, patch 14, sd2_tpool merger, projector to 7168).
  • extract_vision_weights.py — the script used to produce this repo, for reproducibility.
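The extraction described above amounts to filtering tensors by name prefix and writing the matches to a single file. The following is a minimal sketch of that idea, not the contents of extract_vision_weights.py: the function names and the `shard_paths` / `out_path` parameters are hypothetical, and it assumes the `safetensors` and `torch` packages are installed so that bfloat16 tensors can be loaded and saved unchanged.

```python
from typing import Dict, Iterable, Tuple, TypeVar

T = TypeVar("T")

# Prefixes named in the Contents list above.
VISION_PREFIXES: Tuple[str, ...] = ("vision_tower.", "mm_projector.")


def select_vision_tensors(
    tensors: Dict[str, T], prefixes: Tuple[str, ...] = VISION_PREFIXES
) -> Dict[str, T]:
    """Keep only the entries whose name starts with one of the prefixes."""
    return {name: t for name, t in tensors.items() if name.startswith(prefixes)}


def extract_from_shards(shard_paths: Iterable[str], out_path: str) -> int:
    """Copy vision tensors from the given shards into one safetensors file.

    Hypothetical helper: the caller supplies the shard paths. Returns the
    number of tensors written so it can be compared with the expected count.
    """
    # Imported here so select_vision_tensors works without these packages.
    from safetensors import safe_open
    from safetensors.torch import save_file

    selected = {}
    for path in shard_paths:
        with safe_open(path, framework="pt") as f:
            for name in f.keys():
                if name.startswith(VISION_PREFIXES):
                    # Tensors are loaded and stored as-is, with no dtype cast.
                    selected[name] = f.get_tensor(name)
    save_file(selected, out_path)
    return len(selected)
```

If run against the two shards that hold the vision weights, the return value of `extract_from_shards` should match the 335 tensors listed above; any other count suggests a wrong shard or prefix.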

Usage with exo

Add a model card for moonshotai/Kimi-K2.7-Code with:

capabilities = ["text", "thinking", "thinking_toggle", "vision"]

[vision]
image_token_id = 163605
model_type = "kimi_vl"
weights_repo = "aidiffuser/Kimi-K2.7-Code-vision"
processor_repo = "moonshotai/Kimi-K2.7-Code"

Tested working in a distributed setup (2× Mac Studio M3 Ultra, tensor parallelism) with the official INT4 text weights; image understanding confirmed.

License

Same Modified MIT license as the source model. The weights here are an unmodified subset of the original weights. All credit to Moonshot AI.