---
title: Model Tools
emoji: 📚
colorFrom: pink
colorTo: yellow
sdk: static
pinned: false
---

# Model Tools by Naphula
Tools for LLM quantization and merging.

# [graph_v18.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/graph_v18.py)
- Merges models in minutes instead of hours on low-VRAM GPUs. For 3060/3060 Ti users, this script makes otherwise impossible jobs run without running out of memory (OOM): merging 70B models, or large 7B merges with `--cuda`. [More details here](https://huggingface.co/spaces/Naphula/model_tools/blob/main/mergekit_low-VRAM-graph_patch.md)
- Update: v18 is much faster than v4 and replaces the trial-and-error loop with an adaptive, math-based calculator (using GrimJim's `measure.py` logic).

# config.py
- In mergekit's `config.py`, replace line 13 to allow custom file path strings in parameter settings. Before: `ScalarOrGradient: TypeAlias = Union[float, List[float]]`. After: `ScalarOrGradient: TypeAlias = Union[float, List[float], str, bool]`.
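
As a rough illustration of what the widened alias permits, the sketch below classifies a parameter value by type. The helper `describe_parameter` is hypothetical and is not part of mergekit; the alias is written as a plain assignment here so the snippet runs on older Python versions.

```python
from typing import List, Union

# Widened alias as described above: str and bool are accepted in addition
# to a single float or a list of floats.
ScalarOrGradient = Union[float, List[float], str, bool]


def describe_parameter(value: ScalarOrGradient) -> str:
    """Classify a parameter value (hypothetical helper, not part of mergekit)."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "flag"
    if isinstance(value, str):
        return "path"
    if isinstance(value, list):
        return "gradient"
    return "scalar"
```

With this alias, a value such as `"D:/measurements/model.json"` (an illustrative path) type-checks as a parameter alongside `0.5` or `[0.1, 0.9]`.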

# [metadata_audit.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/metadata_audit.py)
- Checks multiple models within subdirectories for vocab or RoPE mismatches (useful for large merges). Calibrated for Mistral Nemo 12B by default.
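
The kind of check described above can be sketched with the standard library alone. This is not the script itself: the function name `audit_models`, the choice of keys, and the reference values in `EXPECTED` are assumptions made for illustration.

```python
import json
from pathlib import Path

# Assumed reference values for illustration; adjust to the model family
# being merged.
EXPECTED = {"vocab_size": 131072, "rope_theta": 1000000.0}


def audit_models(root, expected=EXPECTED):
    """Return {model_dir_name: {key: (found, expected)}} for every mismatch."""
    mismatches = {}
    # Each immediate subdirectory with a config.json is treated as one model.
    for config_path in sorted(Path(root).glob("*/config.json")):
        config = json.loads(config_path.read_text(encoding="utf-8"))
        diff = {
            key: (config.get(key), want)
            for key, want in expected.items()
            if config.get(key) != want
        }
        if diff:
            mismatches[config_path.parent.name] = diff
    return mismatches
```

An empty result means every model under `root` matches the reference values; any entry names the model folder and the keys that differ.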

# [fp32_to_bf16.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/fp32_to_bf16.py)
- Converts FP32 to BF16 safetensors

# [fp32_to_fp16.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/fp32_to_fp16.py)
- Converts FP32 to FP16 safetensors
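
To show what the two conversions above trade off, here is a minimal per-value sketch using only the standard library. It is not how the scripts work internally (they operate on whole safetensors files); the function names are hypothetical and NaN handling is omitted.

```python
import struct


def fp32_to_bf16_bits(x: float) -> int:
    """Round an FP32 value to its 16-bit BF16 pattern (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # BF16 keeps the top 16 bits of FP32 (sign, 8 exponent bits, 7 mantissa
    # bits). Add a rounding bias before dropping the low 16 bits.
    bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + bias) >> 16) & 0xFFFF


def bf16_bits_to_float(b: int) -> float:
    """Expand a BF16 bit pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]


def fp32_to_fp16(x: float) -> float:
    """Round-trip through IEEE half precision using struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]
```

BF16 keeps the FP32 exponent range but has fewer mantissa bits, while FP16 has more mantissa bits but a much smaller range: `fp32_to_fp16(1e5)` raises `OverflowError` because 65504 is the largest finite FP16 value, whereas `fp32_to_bf16_bits(1e5)` succeeds.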

# [pytorch_to_safetensors.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/pytorch_to_safetensors.py)
- Converts PyTorch `.bin` checkpoints to safetensors format

# [textonly_ripper_v2.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/textonly_ripper_v2.py)
- Converts a sharded, multimodal (text and vision) model into a text-only version. Readme at [textonly_ripper.md](https://huggingface.co/spaces/Naphula/model_tools/blob/main/textonly_ripper.md)

# [vocab_resizer.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/vocab_resizer.py)
- Resizes models with a larger `vocab_size` to a standard size (default: 131072, as in Mistral 24B) for use with mergekit. Note that `tokenizer.model` must be copied manually into the `/fixed/` folder.
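
One plausible way to do such a resize is to drop the rows beyond the target size from the vocabulary-sized matrices. The sketch below assumes plain truncation and the tensor names `model.embed_tokens.weight` and `lm_head.weight`; neither is confirmed by the script, and `truncate_vocab` is a hypothetical name. Rows are represented as nested lists so the example runs without any ML library.

```python
def truncate_vocab(state, config, target_size=131072,
                   keys=("model.embed_tokens.weight", "lm_head.weight")):
    """Return copies of state/config with vocab rows beyond target_size dropped."""
    new_state = dict(state)
    for key in keys:
        if key not in new_state:
            continue
        rows = new_state[key]
        if len(rows) < target_size:
            # This sketch only shrinks; growing a vocabulary is out of scope.
            raise ValueError(f"{key} has only {len(rows)} rows")
        new_state[key] = rows[:target_size]
    # Keep config.json consistent with the new matrix height.
    new_config = dict(config, vocab_size=target_size)
    return new_state, new_config
```

The config update matters as much as the slice: a `vocab_size` that disagrees with the embedding shape is the kind of mismatch that breaks a merge.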

# [lm_head_remover.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/lm_head_remover.py)
- Loads a "fat" 18.9 GB model (default: Gemma 9B), forces it to tie the weights (deduplicating the `lm_head`), and re-saves it. This drops the file size to ~17.2 GB and makes it compatible with models that already tie these weights.
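
The size drop quoted above is roughly what one duplicated output matrix costs. The arithmetic below assumes Gemma 2 9B dimensions (vocabulary 256,000, hidden size 3,584) and 2 bytes per weight; these figures are assumptions, not taken from the script. `tie_lm_head` is a hypothetical helper that shows the forced tie on a plain dict.

```python
VOCAB_SIZE = 256_000   # assumed Gemma 2 9B vocabulary size
HIDDEN_SIZE = 3_584    # assumed Gemma 2 9B hidden size
BYTES_PER_PARAM = 2    # assumed BF16/FP16 storage

# Size of one vocab-by-hidden matrix, i.e. the duplicated lm_head.
lm_head_bytes = VOCAB_SIZE * HIDDEN_SIZE * BYTES_PER_PARAM
lm_head_gib = lm_head_bytes / 2**30


def tie_lm_head(state, config):
    """Drop the separate lm_head and mark the embeddings as tied."""
    new_state = dict(state)
    # Forced tie: the embedding matrix is reused as the output projection.
    new_state.pop("lm_head.weight", None)
    return new_state, dict(config, tie_word_embeddings=True)
```

Under these assumptions `lm_head_gib` comes to about 1.7, which is consistent with the 18.9 GB to ~17.2 GB drop.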

# [model_index_json_generator.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/model_index_json_generator.py)
- Generates a missing `model.safetensors.index.json` file. Useful for cases where safetensors may have been sharded at the wrong size.
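
An index of this kind can be rebuilt from the shard headers alone, since each safetensors file starts with an 8-byte little-endian header length followed by a JSON header. The sketch below is a minimal standard-library version, not the script itself; `read_safetensors_header` and `build_index` are hypothetical names.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Parse the JSON header: 8-byte little-endian length, then that many bytes."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def build_index(model_dir):
    """Build the dict normally stored in model.safetensors.index.json."""
    weight_map = {}
    total_size = 0
    for shard in sorted(Path(model_dir).glob("*.safetensors")):
        for name, info in read_safetensors_header(shard).items():
            if name == "__metadata__":  # optional free-form metadata, not a tensor
                continue
            start, end = info["data_offsets"]
            total_size += end - start  # tensor payload size in bytes
            weight_map[name] = shard.name
    return {"metadata": {"total_size": total_size}, "weight_map": weight_map}
```

The result can be written next to the shards with `json.dump(build_index(model_dir), f, indent=2)`. No tensor data is read, so this stays fast even on large models.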

# [folder_content_combiner_anyfiles.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/folder_content_combiner_anyfiles.py)
- Combines all files in the script's current directory into a single output file, sorted alphabetically.
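
The behaviour described above can be sketched as follows. The function name `combine_files`, the default output name, and the `=====` separator format are assumptions for illustration; the real script works on its own directory rather than taking a folder argument.

```python
from pathlib import Path


def combine_files(folder, output_name="combined.txt"):
    """Concatenate every file in folder (sorted by name) into output_name."""
    folder = Path(folder)
    output_path = folder / output_name
    # Collect sources first and skip the output file so it never includes itself.
    sources = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p != output_path
    )
    with open(output_path, "w", encoding="utf-8") as out:
        for path in sources:
            out.write(f"===== {path.name} =====\n")
            # errors="replace" keeps the run going if a file is not valid UTF-8.
            out.write(path.read_text(encoding="utf-8", errors="replace"))
            out.write("\n")
    return output_path
```

Subdirectories are ignored in this sketch; only files directly inside `folder` are combined.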

# [GGUF Repo Suite](https://huggingface.co/spaces/Naphula/gguf-repo-suite)
- Create and quantize Hugging Face models

# [Markdown Viewer](https://huggingface.co/spaces/Naphula/Portable_Offline_Markdown_Viewer)
- Portable Offline Markdown Viewer

# [Markdown to SMF](https://huggingface.co/spaces/Naphula/model_tools/blob/main/md_to_smf.py)
- Converts a Markdown string to an SMF-compatible BBCode string. Not perfect: it sometimes misses double bold tags.
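
A conversion like this is typically a sequence of regex substitutions. The sketch below covers only links, bold, and italic, and is not the script's actual rule set; `md_to_bbcode` is a hypothetical name.

```python
import re


def md_to_bbcode(text: str) -> str:
    """Convert a small Markdown subset (links, bold, italic) to BBCode."""
    # Links first, so the square brackets of generated tags are never
    # mistaken for Markdown link text.
    text = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r"[url=\2]\1[/url]", text)
    # Bold must run before italic so ** is not consumed as two single *.
    text = re.sub(r"\*\*(.+?)\*\*", r"[b]\1[/b]", text)
    text = re.sub(r"\*(.+?)\*", r"[i]\1[/i]", text)
    return text
```

Order of substitution is the main design choice here. Simple regexes like these do not handle nested or adjacent emphasis reliably, which is one way a converter can miss bold tags.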

# [Quant Clone](https://github.com/electroglyph/quant_clone)
- A tool for recreating UD quants such as Q8_K_XL. Examples: [Mistral 24B](https://huggingface.co/spaces/Naphula/model_tools/raw/main/Mistral-Small-3.2-24B-Instruct-2506-UD-Q8_K_XL_UD.txt), [Mistral 7B](https://huggingface.co/spaces/Naphula/model_tools/raw/main/Warlock-7B-v2-Q8_K_XL.txt)

# [Text Analysis Suite v1.5](https://huggingface.co/spaces/Naphula/TAS_1.5)
- Analyze text files with advanced metrics

---

# Not Functional

# [Failed Experiment gguf_to_safetensors_v2.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/gguf_to_safetensors_v2.py)
- Unsuccessful attempt by Gemini to patch the `gguf_to_safetensors` script. The missing JSON files are hard to reconstruct. See also [safetensors_meta_ripper_v1.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/safetensors_meta_ripper_v1.py) and [tokenizer_ripper_v1.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/tokenizer_ripper_v1.py)

# [IQ5_NL.md](https://huggingface.co/spaces/Naphula/model_tools/blob/main/IQ5_NL.md)
- Note: Not functional yet. Includes the code needed to quantize IQ5_NL GGUFs using block size 32.