Commit 20194d8 · 0 parent(s)
Duplicate from aidiffuser/Kimi-K2.7-Code-vision
Co-authored-by: aidiffuser <aidiffuser@users.noreply.huggingface.co>
- .gitattributes +35 -0
- README.md +50 -0
- config.json +38 -0
- extract_vision_weights.py +60 -0
- kimi_k27_vision.safetensors +3 -0
.gitattributes
ADDED
@@ -0,0 +1,35 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,50 @@
+---
+license: other
+license_name: modified-mit
+license_link: https://huggingface.co/moonshotai/Kimi-K2.7-Code/blob/main/LICENSE
+base_model: moonshotai/Kimi-K2.7-Code
+tags:
+- mlx
+- vision
+- kimi
+- exo
+---
+
+# Kimi-K2.7-Code-vision
+
+Vision-only weights (MoonViT tower + multimodal projector) extracted from
+[moonshotai/Kimi-K2.7-Code](https://huggingface.co/moonshotai/Kimi-K2.7-Code)
+for use with MLX-based inference stacks such as [exo](https://github.com/exo-explore/exo),
+in the same format as [exolabs/Kimi-K2.6-vision](https://huggingface.co/exolabs/Kimi-K2.6-vision).
+
+## Contents
+
+- `kimi_k27_vision.safetensors` — all 335 `vision_tower.*` and `mm_projector.*`
+tensors from the official repo (shards 63–64), original bfloat16, unmodified.
+- `config.json` — vision config copied from the official `config.json`
+(verified byte-identical to Kimi-K2.6's vision config: 27-layer MoonViT,
+hidden 1152, patch 14, `sd2_tpool` merger, projector to 7168).
+- `extract_vision_weights.py` — the script used to produce this repo,
+for reproducibility.
+
+## Usage with exo
+
+Add a model card for `moonshotai/Kimi-K2.7-Code` with:
+
+```toml
+capabilities = ["text", "thinking", "thinking_toggle", "vision"]
+
+[vision]
+image_token_id = 163605
+model_type = "kimi_vl"
+weights_repo = "aidiffuser/Kimi-K2.7-Code-vision"
+processor_repo = "moonshotai/Kimi-K2.7-Code"
+```
+
+Tested working: distributed (2× Mac Studio M3 Ultra, tensor parallelism) with
+the official INT4 text weights, image understanding confirmed.
+
+## License
+
+Same Modified MIT license as the source model; these are a subset of the
+original weights, unmodified. All credit to Moonshot AI.
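The README's claims about the weights file (335 tensors, only `vision_tower.*` and `mm_projector.*` names, bfloat16) can be checked without loading any tensor data, because a safetensors file starts with an 8-byte little-endian length followed by a JSON header describing every tensor. The following is a minimal sketch under that assumption; the helper names `read_safetensors_header` and `summarize` are hypothetical and not part of this repo.

```python
import json
import struct
from collections import Counter

# Prefixes named in the README; any other tensor name is counted as "other".
EXPECTED_PREFIXES = ("vision_tower.", "mm_projector.")


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64 length
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata, not a tensor
    return header


def summarize(header, prefixes=EXPECTED_PREFIXES):
    """Count tensors per name prefix and per dtype string."""
    by_prefix, by_dtype = Counter(), Counter()
    for name, info in header.items():
        matched = next((p for p in prefixes if name.startswith(p)), "other")
        by_prefix[matched] += 1
        by_dtype[info["dtype"]] += 1
    return dict(by_prefix), dict(by_dtype)
```

Calling `summarize(read_safetensors_header("kimi_k27_vision.safetensors"))` on the downloaded file should, if the README is accurate, give prefix counts that sum to 335 with no `"other"` entry and `BF16` as the only dtype. That result is an expectation derived from the README, not something verified here.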
config.json
ADDED
@@ -0,0 +1,38 @@
+{
+ "source_model": "moonshotai/Kimi-K2.7-Code",
+ "component": "vision_tower + mm_projector",
+ "description": "Vision-only weights extracted locally from Kimi-K2.7-Code for use with MLX/exo.",
+ "vision_config": {
+  "_attn_implementation": "flash_attention_2",
+  "init_pos_emb_height": 64,
+  "init_pos_emb_time": 4,
+  "init_pos_emb_width": 64,
+  "merge_kernel_size": [
+   2,
+   2
+  ],
+  "merge_type": "sd2_tpool",
+  "mm_hidden_size": 1152,
+  "mm_projector_type": "patchmerger",
+  "patch_size": 14,
+  "pos_emb_type": "divided_fixed",
+  "projector_hidden_act": "gelu",
+  "projector_ln_eps": 1e-05,
+  "text_hidden_size": 7168,
+  "video_attn_type": "spatial_temporal",
+  "vt_hidden_size": 1152,
+  "vt_intermediate_size": 4304,
+  "vt_num_attention_heads": 16,
+  "vt_num_hidden_layers": 27
+ },
+ "projector": {
+  "type": "PatchMergerMLP",
+  "input_dim": 4608,
+  "hidden_dim": 4608,
+  "output_dim": 7168,
+  "pre_norm_eps": 1e-05
+ },
+ "num_tensors": 335,
+ "original_dtype": "bfloat16",
+ "media_placeholder_token_id": 163605
+}
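The numbers in this config are mutually consistent in a way that is easy to check: the projector's `input_dim` of 4608 equals the tower width of 1152 multiplied by the 2×2 merge window. The sketch below works through that arithmetic. It assumes the patch merger concatenates the features of one non-overlapping 2×2 window and that image sides divide evenly by patch size times window size; both are inferences from the config values, not statements from the source, and the helper names are hypothetical.

```python
import json

# Values copied from the config.json in this commit (only the fields used below).
CONFIG = json.loads("""
{
 "vision_config": {
  "merge_kernel_size": [2, 2],
  "patch_size": 14,
  "vt_hidden_size": 1152,
  "vt_num_attention_heads": 16,
  "text_hidden_size": 7168
 },
 "projector": {"input_dim": 4608, "hidden_dim": 4608, "output_dim": 7168}
}
""")


def merged_feature_dim(vision_cfg):
    """Width after concatenating the patch features of one merge window."""
    kh, kw = vision_cfg["merge_kernel_size"]
    return vision_cfg["vt_hidden_size"] * kh * kw


def tokens_for_image(height, width, vision_cfg):
    """Merged tokens for an image, assuming non-overlapping patches and windows."""
    kh, kw = vision_cfg["merge_kernel_size"]
    p = vision_cfg["patch_size"]
    return (height // (p * kh)) * (width // (p * kw))


vc = CONFIG["vision_config"]
print(merged_feature_dim(vc))                                # 1152 * 2 * 2 = 4608
print(vc["vt_hidden_size"] // vc["vt_num_attention_heads"])  # per-head width: 72
print(tokens_for_image(448, 448, vc))                        # (448 / 28) ** 2 = 256
```

The projector's `output_dim` of 7168 likewise matches `text_hidden_size`, which is the width the language model expects for each image token.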
extract_vision_weights.py
ADDED
@@ -0,0 +1,60 @@
+#!/usr/bin/env python3
+"""Build a local Kimi-K2.7-Code vision tower for exo, mirroring exolabs--Kimi-K2.6-vision.
+
+Copies the 335 vision_tower.* / mm_projector.* tensors out of the official
+moonshotai/Kimi-K2.7-Code download (shards 63+64) into a single safetensors
+file, with a config.json adapted from the K2.6 tower. exo's downloader falls
+back to local files when the repo doesn't exist on HF, so no upload is needed.
+
+Run with exo's venv python (has safetensors):
+    ~/exo-next/.venv/bin/python ~/.exo/extract-k27-vision-tower.py
+"""
+
+import json
+from pathlib import Path
+
+from safetensors import safe_open
+from safetensors.numpy import save_file
+
+MODELS = Path("/Volumes/LLM/exo-models")
+SRC = MODELS / "moonshotai--Kimi-K2.7-Code"
+K26_TOWER = MODELS / "exolabs--Kimi-K2.6-vision"
+DST = MODELS / "aidiffuser--Kimi-K2.7-Code-vision"
+
+SHARDS = ["model-00063-of-000064.safetensors", "model-00064-of-000064.safetensors"]
+PREFIXES = ("vision_tower.", "mm_projector.")
+
+for shard in SHARDS:
+    if not (SRC / shard).exists():
+        raise SystemExit(f"missing {shard} — download not finished yet")
+
+tensors = {}
+dtypes = set()
+for shard in SHARDS:
+    # framework="np" cannot represent bf16, so load the tensors through torch (framework="pt")
+    with safe_open(str(SRC / shard), framework="pt") as f:
+        for key in f.keys():
+            if key.startswith(PREFIXES):
+                t = f.get_tensor(key)
+                dtypes.add(str(t.dtype))
+                tensors[key] = t
+
+print(f"extracted {len(tensors)} tensors, dtypes: {dtypes}")
+assert len(tensors) == 335, f"expected 335 tensors, got {len(tensors)}"
+
+DST.mkdir(exist_ok=True)
+
+# safetensors.numpy can't write bf16; go through torch's save_file instead
+from safetensors.torch import save_file as save_pt
+
+save_pt(tensors, str(DST / "kimi_k27_vision.safetensors"))
+
+cfg = json.loads((K26_TOWER / "config.json").read_text())
+cfg["source_model"] = "moonshotai/Kimi-K2.7-Code"
+cfg["description"] = (
+    "Vision-only weights extracted locally from Kimi-K2.7-Code for use with MLX/exo."
+)
+(DST / "config.json").write_text(json.dumps(cfg, indent=1) + "\n")
+
+size = (DST / "kimi_k27_vision.safetensors").stat().st_size
+print(f"wrote {DST} ({size / 1e9:.2f} GB) — done")
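The script hard-codes the two shards that hold the vision tensors. A possible alternative, sketched here, is to look the shards up in the checkpoint's index file instead. This assumes the source download includes a `model.safetensors.index.json` with a `weight_map` from tensor name to shard file name, which is the usual layout for sharded safetensors checkpoints on the Hub but is not confirmed by this commit. The function names are hypothetical.

```python
import json
from pathlib import Path

PREFIXES = ("vision_tower.", "mm_projector.")


def shards_for_prefixes(weight_map, prefixes=PREFIXES):
    """Map each shard file name to the matching tensor names stored in it."""
    wanted = {}
    for name, shard in weight_map.items():
        if name.startswith(prefixes):  # str.startswith accepts a tuple of prefixes
            wanted.setdefault(shard, []).append(name)
    return wanted


def load_weight_map(model_dir):
    """Read the tensor-name -> shard mapping from a sharded checkpoint's index."""
    index_path = Path(model_dir) / "model.safetensors.index.json"  # assumed file name
    return json.loads(index_path.read_text())["weight_map"]
```

With this, `shards_for_prefixes(load_weight_map(SRC))` would return only the shards worth opening, and the total number of listed names could be compared with the expected 335 before any tensor is read.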
kimi_k27_vision.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aae39a3d166a65795fb62ad14c20f4b7fd840db209a9a016960afe1db02520bc
+size 942326328
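The three lines above are a Git LFS pointer: the repository stores only the SHA-256 digest and byte size of the real weights file. A downloaded copy can be checked against the pointer as in the sketch below, which uses only the Python standard library. The function names are hypothetical, and the sketch handles only `sha256` oids.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:aae39a3d166a65795fb62ad14c20f4b7fd840db209a9a016960afe1db02520bc
size 942326328
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a pointer into (sha256_hex, size_bytes)."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return digest, int(fields["size"])


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Return True if the file's size and SHA-256 digest match the pointer."""
    digest, size = parse_lfs_pointer(pointer_text)
    path = Path(path)
    if path.stat().st_size != size:
        return False  # cheap check first; skips hashing on a size mismatch
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == digest
```

For example, `matches_pointer("Kimi-K2.7-Code-vision/kimi_k27_vision.safetensors", POINTER)` would confirm that a local download is the roughly 0.94 GB file this commit refers to; the local path is an assumption.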