aisoc aidiffuser committed on
Commit 20194d8 · 0 Parent(s):

Duplicate from aidiffuser/Kimi-K2.7-Code-vision

Co-authored-by: aidiffuser <aidiffuser@users.noreply.huggingface.co>

.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,50 @@
+ ---
+ license: other
+ license_name: modified-mit
+ license_link: https://huggingface.co/moonshotai/Kimi-K2.7-Code/blob/main/LICENSE
+ base_model: moonshotai/Kimi-K2.7-Code
+ tags:
+ - mlx
+ - vision
+ - kimi
+ - exo
+ ---
+
+ # Kimi-K2.7-Code-vision
+
+ Vision-only weights (MoonViT tower + multimodal projector) extracted from
+ [moonshotai/Kimi-K2.7-Code](https://huggingface.co/moonshotai/Kimi-K2.7-Code)
+ for use with MLX-based inference stacks such as [exo](https://github.com/exo-explore/exo),
+ in the same format as [exolabs/Kimi-K2.6-vision](https://huggingface.co/exolabs/Kimi-K2.6-vision).
+
+ ## Contents
+
+ - `kimi_k27_vision.safetensors`: all 335 `vision_tower.*` and `mm_projector.*`
+   tensors from the official repo (shards 63 and 64), original bfloat16, unmodified.
+ - `config.json`: vision config copied from the official `config.json`
+   (verified byte-identical to Kimi-K2.6's vision config: 27-layer MoonViT,
+   hidden size 1152, patch size 14, `sd2_tpool` merger, projector to 7168).
+ - `extract_vision_weights.py`: the script used to produce this repo,
+   for reproducibility.
+
+ ## Usage with exo
+
+ Add a model card for `moonshotai/Kimi-K2.7-Code` with:
+
+ ```toml
+ capabilities = ["text", "thinking", "thinking_toggle", "vision"]
+
+ [vision]
+ image_token_id = 163605
+ model_type = "kimi_vl"
+ weights_repo = "aidiffuser/Kimi-K2.7-Code-vision"
+ processor_repo = "moonshotai/Kimi-K2.7-Code"
+ ```
+
+ Tested in a distributed setup (2× Mac Studio M3 Ultra, tensor parallelism) with
+ the official INT4 text weights; image understanding was confirmed working.
+
+ ## License
+
+ Same Modified MIT license as the source model; these are a subset of the
+ original weights, unmodified. All credit to Moonshot AI.
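
The README's claims about the weights file (335 tensors, only `vision_tower.*` and `mm_projector.*` names, bfloat16) can be spot-checked without loading any tensor data, because a `.safetensors` file begins with an 8-byte little-endian header length followed by a JSON header. The following is a minimal sketch, not part of this commit; the helper names `read_safetensors_header`, `summarize`, and `write_demo_file` are hypothetical, and the demo file is synthetic.

```python
import json
import struct
from collections import Counter
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # The file starts with an unsigned 64-bit little-endian header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize(header, prefixes=("vision_tower.", "mm_projector.")):
    """Count tensors per name prefix and per dtype string (for example "BF16")."""
    by_prefix = Counter()
    by_dtype = Counter()
    for name, info in header.items():
        prefix = next((p for p in prefixes if name.startswith(p)), "other")
        by_prefix[prefix] += 1
        by_dtype[info["dtype"]] += 1
    return by_prefix, by_dtype


def write_demo_file(path):
    """Write a tiny synthetic .safetensors file holding two 1-element BF16 tensors."""
    header = {
        "vision_tower.demo.weight": {"dtype": "BF16", "shape": [1], "data_offsets": [0, 2]},
        "mm_projector.demo.weight": {"dtype": "BF16", "shape": [1], "data_offsets": [2, 4]},
    }
    blob = json.dumps(header).encode("utf-8")
    # Layout: header length, JSON header, then the raw tensor bytes (4 zero bytes here).
    Path(path).write_bytes(struct.pack("<Q", len(blob)) + blob + bytes(4))
```

Run against the real `kimi_k27_vision.safetensors`, the two counters should sum to 335 with no `"other"` bucket and a single dtype if the README's description holds; that result is not asserted here.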
config.json ADDED
@@ -0,0 +1,38 @@
+ {
+  "source_model": "moonshotai/Kimi-K2.7-Code",
+  "component": "vision_tower + mm_projector",
+  "description": "Vision-only weights extracted locally from Kimi-K2.7-Code for use with MLX/exo.",
+  "vision_config": {
+   "_attn_implementation": "flash_attention_2",
+   "init_pos_emb_height": 64,
+   "init_pos_emb_time": 4,
+   "init_pos_emb_width": 64,
+   "merge_kernel_size": [
+    2,
+    2
+   ],
+   "merge_type": "sd2_tpool",
+   "mm_hidden_size": 1152,
+   "mm_projector_type": "patchmerger",
+   "patch_size": 14,
+   "pos_emb_type": "divided_fixed",
+   "projector_hidden_act": "gelu",
+   "projector_ln_eps": 1e-05,
+   "text_hidden_size": 7168,
+   "video_attn_type": "spatial_temporal",
+   "vt_hidden_size": 1152,
+   "vt_num_attention_heads": 16,
+   "vt_num_hidden_layers": 27
+  },
+  "projector": {
+   "type": "PatchMergerMLP",
+   "input_dim": 4608,
+   "hidden_dim": 4608,
+   "output_dim": 7168,
+   "pre_norm_eps": 1e-05
+  },
+  "num_tensors": 335,
+  "original_dtype": "bfloat16",
+  "media_placeholder_token_id": 163605
+ }
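
The numbers in this config are mutually consistent: the projector's `input_dim` of 4608 equals `mm_hidden_size` (1152) times the 2×2 `merge_kernel_size`, and its `output_dim` equals `text_hidden_size` (7168). The sketch below checks that arithmetic on a copy of the relevant fields. Reading the merger as a concatenation of 2×2 neighbouring patch features is an assumption that happens to fit the numbers, and `tokens_for_image` additionally assumes non-overlapping patches and image sides divisible by 28; neither is confirmed by this commit. The function names are hypothetical.

```python
from math import prod

# Fields copied from the config.json above.
vision_config = {
    "merge_kernel_size": [2, 2],
    "mm_hidden_size": 1152,
    "patch_size": 14,
    "text_hidden_size": 7168,
    "vt_hidden_size": 1152,
    "vt_num_attention_heads": 16,
}
projector = {"input_dim": 4608, "hidden_dim": 4608, "output_dim": 7168}


def check_dims(vc, proj):
    """Compare the projector widths with the values implied by the vision config."""
    merged_width = vc["mm_hidden_size"] * prod(vc["merge_kernel_size"])
    return {
        "merged_width": merged_width,
        "input_matches": merged_width == proj["input_dim"],
        "output_matches": proj["output_dim"] == vc["text_hidden_size"],
        "head_dim": vc["vt_hidden_size"] // vc["vt_num_attention_heads"],
    }


def tokens_for_image(height, width, vc):
    """Estimate merged image tokens (assumption: plain patch grid, 2x2 spatial merge)."""
    kernel_h, kernel_w = vc["merge_kernel_size"]
    patches_h = height // vc["patch_size"]
    patches_w = width // vc["patch_size"]
    return (patches_h // kernel_h) * (patches_w // kernel_w)


print(check_dims(vision_config, projector))
print(tokens_for_image(448, 448, vision_config))
```

Under those assumptions a 448×448 image gives a 32×32 patch grid and 256 merged tokens; the real processor may resize or pad differently.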
extract_vision_weights.py ADDED
@@ -0,0 +1,60 @@
+ #!/usr/bin/env python3
+ """Build a local Kimi-K2.7-Code vision tower for exo, mirroring exolabs--Kimi-K2.6-vision.
+
+ Copies the 335 vision_tower.* / mm_projector.* tensors out of the official
+ moonshotai/Kimi-K2.7-Code download (shards 63+64) into a single safetensors
+ file, with a config.json adapted from the K2.6 tower. exo's downloader falls
+ back to local files when the repo doesn't exist on HF, so no upload is needed.
+
+ Run with exo's venv python (needs safetensors and torch):
+     ~/exo-next/.venv/bin/python ~/.exo/extract-k27-vision-tower.py
+ """
+
+ import json
+ from pathlib import Path
+
+ from safetensors import safe_open
+ from safetensors.torch import save_file as save_pt
+
+ MODELS = Path("/Volumes/LLM/exo-models")
+ SRC = MODELS / "moonshotai--Kimi-K2.7-Code"
+ K26_TOWER = MODELS / "exolabs--Kimi-K2.6-vision"
+ DST = MODELS / "aidiffuser--Kimi-K2.7-Code-vision"
+
+ SHARDS = ["model-00063-of-000064.safetensors", "model-00064-of-000064.safetensors"]
+ PREFIXES = ("vision_tower.", "mm_projector.")
+
+ for shard in SHARDS:
+     if not (SRC / shard).exists():
+         raise SystemExit(f"missing {shard}: download not finished yet")
+
+ tensors = {}
+ dtypes = set()
+ for shard in SHARDS:
+     # NumPy has no bfloat16, so framework="np" cannot load these tensors; use PyTorch.
+     with safe_open(str(SRC / shard), framework="pt") as f:
+         for key in f.keys():
+             if key.startswith(PREFIXES):
+                 t = f.get_tensor(key)
+                 dtypes.add(str(t.dtype))
+                 tensors[key] = t
+
+ print(f"extracted {len(tensors)} tensors, dtypes: {dtypes}")
+ assert len(tensors) == 335, f"expected 335 tensors, got {len(tensors)}"
+
+ DST.mkdir(parents=True, exist_ok=True)
+
+ # Written with safetensors.torch (imported above as save_pt) because
+ # safetensors.numpy cannot represent bfloat16.
+
+ save_pt(tensors, str(DST / "kimi_k27_vision.safetensors"))
+
+ cfg = json.loads((K26_TOWER / "config.json").read_text())
+ cfg["source_model"] = "moonshotai/Kimi-K2.7-Code"
+ cfg["description"] = (
+     "Vision-only weights extracted locally from Kimi-K2.7-Code for use with MLX/exo."
+ )
+ (DST / "config.json").write_text(json.dumps(cfg, indent=1) + "\n")
+
+ size = (DST / "kimi_k27_vision.safetensors").stat().st_size
+ print(f"wrote {DST} ({size / 1e9:.2f} GB) — done")
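
The script hardcodes the two shard filenames that contain the vision tensors. Sharded Hugging Face checkpoints usually ship a `model.safetensors.index.json` whose `weight_map` maps each tensor name to its shard file, so the shard list could instead be derived from the prefixes. The sketch below illustrates that idea only; whether the Kimi-K2.7-Code download includes such an index file is an assumption, and `shards_for_prefixes` is a hypothetical helper, not part of the committed script.

```python
import json
from pathlib import Path


def shards_for_prefixes(index_path, prefixes):
    """Return the sorted shard filenames that hold tensors with the given name prefixes."""
    index = json.loads(Path(index_path).read_text())
    weight_map = index["weight_map"]  # tensor name -> shard filename
    wanted = tuple(prefixes)
    return sorted({shard for name, shard in weight_map.items() if name.startswith(wanted)})


def demo_index():
    """Build a small synthetic index for illustration (tensor names are made up)."""
    return {
        "metadata": {},
        "weight_map": {
            "language_model.layers.0.weight": "model-00001-of-000064.safetensors",
            "vision_tower.demo.weight": "model-00063-of-000064.safetensors",
            "mm_projector.demo.weight": "model-00064-of-000064.safetensors",
        },
    }
```

With the result in hand, the `SHARDS` list in the script could be replaced by `shards_for_prefixes(SRC / "model.safetensors.index.json", PREFIXES)`, keeping the existence check and the 335-tensor assertion as safeguards.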
kimi_k27_vision.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:aae39a3d166a65795fb62ad14c20f4b7fd840db209a9a016960afe1db02520bc
+ size 942326328
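
The three lines above are a Git LFS pointer: the repository stores only the SHA-256 digest and byte size (942326328 bytes, about 0.94 GB) of the real weights file. A downloaded copy can be checked against the pointer with the standard library alone. This is a minimal sketch; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is whatever the reader's download location is.

```python
import hashlib
from pathlib import Path

POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:aae39a3d166a65795fb62ad14c20f4b7fd840db209a9a016960afe1db02520bc
size 942326328
"""


def parse_lfs_pointer(text):
    """Parse the "key value" lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at path has the size and digest named in the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap size check before hashing
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Hash in chunks so a large file is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, `matches_pointer("kimi_k27_vision.safetensors", parse_lfs_pointer(POINTER))` should return `True` for an intact download.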