---
title: Model Tools
emoji: 📚
colorFrom: pink
colorTo: yellow
sdk: static
pinned: false
---
# Model Tools by Naphula
Tools to enhance LLM quantizations and merging
# [graph_v18.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/graph_v18.py)
- Merges models in minutes instead of hours on low-VRAM GPUs. On a 3060 or 3060 Ti, this script makes otherwise impossible jobs feasible (merging 70B models, or large 7B merges with `--cuda`) without running out of memory (OOM). [More details here](https://huggingface.co/spaces/Naphula/model_tools/blob/main/mergekit_low-VRAM-graph_patch.md)
- Update: v18 is much faster than v4 and replaces the trial-and-error loop with an adaptive, math-based calculator (using GrimJim's `measure.py` logic).
# config.py
- Replace line 13 to allow custom filepath strings within parameter settings. BEFORE: `ScalarOrGradient: TypeAlias = Union[float, List[float]]` → AFTER: `ScalarOrGradient: TypeAlias = Union[float, List[float], str, bool]`
# [metadata_audit.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/metadata_audit.py)
- Checks multiple models within subdirectories for vocab or RoPE mismatches (useful for large merges). Calibrated for Mistral Nemo 12B by default.
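
The kind of check described above can be sketched with the standard library alone. This is an illustration, not the script's actual logic: it assumes each model sits in its own subdirectory with a Hugging Face style `config.json`, and the `audit` function and its default keys (`vocab_size`, `rope_theta`) are choices made for this example.

```python
import json
from pathlib import Path


def audit(root, keys=("vocab_size", "rope_theta")):
    """Return {key: {value: [model dirs]}} for each key whose value differs across models.

    Assumes scalar (hashable) config values; a missing key is recorded as None.
    """
    seen = {key: {} for key in keys}
    for cfg_path in sorted(Path(root).glob("*/config.json")):
        cfg = json.loads(cfg_path.read_text(encoding="utf-8"))
        for key in keys:
            seen[key].setdefault(cfg.get(key), []).append(cfg_path.parent.name)
    # Keep only the keys where more than one distinct value was found.
    return {key: groups for key, groups in seen.items() if len(groups) > 1}
```

Calling `audit("models")` on a folder of model subdirectories returns an empty dict when every model agrees on the checked keys, and otherwise groups the model folders by the value they report.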
# [fp32_to_bf16.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/fp32_to_bf16.py)
- Converts FP32 to BF16 safetensors
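
To show what the precision change means for a single value, here is a standard-library sketch of FP32 to BF16 rounding (round-to-nearest-even on the upper 16 bits). It is a generic illustration of the number format, not the script's code, and both function names are hypothetical.

```python
import struct


def fp32_to_bf16_bits(x: float) -> int:
    """Round an FP32 value to BF16 and return the 16-bit pattern."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    if (bits & 0x7FFFFFFF) > 0x7F800000:  # NaN: truncate and force a quiet NaN
        return (bits >> 16) | 0x0040
    # Round to nearest, ties to even, before dropping the low 16 bits.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_fp32(b: int) -> float:
    """Expand a BF16 bit pattern back to a Python float (exact)."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]


print(hex(fp32_to_bf16_bits(1.0)))                    # 0x3f80
print(bf16_bits_to_fp32(fp32_to_bf16_bits(3.14159)))  # 3.140625
```

BF16 keeps the full FP32 exponent range but only 8 bits of mantissa precision, which is why 3.14159 comes back as 3.140625.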
# [fp32_to_fp16.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/fp32_to_fp16.py)
- Converts FP32 to FP16 safetensors
# [pytorch_to_safetensors.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/pytorch_to_safetensors.py)
- Converts PyTorch `.bin` checkpoints to safetensors format
# [textonly_ripper_v2.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/textonly_ripper_v2.py)
- Converts a sharded, multimodal (text and vision) model into a text-only version. Readme at [textonly_ripper.md](https://huggingface.co/spaces/Naphula/model_tools/blob/main/textonly_ripper.md)
# [vocab_resizer.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/vocab_resizer.py)
- Resizes models with a larger `vocab_size` to a standard size (default: 131072, as in Mistral 24B) for use with mergekit. Note that `tokenizer.model` must be copied manually into the `/fixed/` folder.
# [lm_head_remover.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/lm_head_remover.py)
- Loads a "fat" 18.9 GB model (default: Gemma 9B), forces it to tie the weights (deduplicating the `lm_head`), and re-saves it. This drops the file size to ~17.2 GB and makes it compatible with the others.
# [model_index_json_generator.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/model_index_json_generator.py)
- Generates a missing `model.safetensors.index.json` file. Useful for cases where safetensors may have been sharded at the wrong size.
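
A minimal sketch of how such an index can be rebuilt, assuming the standard safetensors layout (an 8-byte little-endian header length followed by a JSON header) and the usual `metadata.total_size` plus `weight_map` index structure. The function names are hypothetical and the linked script may work differently.

```python
import json
import struct
from pathlib import Path


def read_header(path):
    """Read only the JSON header of a .safetensors file (no tensor data is loaded)."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def build_index(model_dir):
    """Map every tensor name to the shard that holds it and sum the tensor byte sizes."""
    weight_map, total_size = {}, 0
    for shard in sorted(Path(model_dir).glob("*.safetensors")):
        for name, info in read_header(shard).items():
            if name == "__metadata__":  # optional free-form metadata, not a tensor
                continue
            start, end = info["data_offsets"]
            total_size += end - start
            weight_map[name] = shard.name
    return {"metadata": {"total_size": total_size}, "weight_map": weight_map}


def write_index(model_dir):
    """Write model.safetensors.index.json next to the shards and return its path."""
    out_path = Path(model_dir) / "model.safetensors.index.json"
    out_path.write_text(json.dumps(build_index(model_dir), indent=2), encoding="utf-8")
    return out_path
```

Because only the headers are read, this runs quickly even on very large shards.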
# [folder_content_combiner_anyfiles.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/folder_content_combiner_anyfiles.py)
- Combines all files in the script's current directory into a single output file, sorted alphabetically.
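
The behavior described above can be sketched as follows. This is an illustration under assumptions: the `combine` function, the output filename, and the `=====` separator format are hypothetical, and unlike the script it takes the folder as an argument instead of using its own directory.

```python
from pathlib import Path


def combine(folder, output_name="combined_output.txt"):
    """Concatenate every file in `folder` into one output file, sorted by filename."""
    folder = Path(folder)
    out_path = folder / output_name
    # Skip subdirectories and any output file left over from a previous run.
    files = sorted(
        (p for p in folder.iterdir() if p.is_file() and p != out_path),
        key=lambda p: p.name,
    )
    with open(out_path, "w", encoding="utf-8") as out:
        for p in files:
            out.write(f"===== {p.name} =====\n")
            # Undecodable bytes are replaced instead of aborting the whole run.
            out.write(p.read_text(encoding="utf-8", errors="replace"))
            out.write("\n")
    return out_path
```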
# [GGUF Repo Suite](https://huggingface.co/spaces/Naphula/gguf-repo-suite)
- Create and quantize Hugging Face models
# [Markdown Viewer](https://huggingface.co/spaces/Naphula/Portable_Offline_Markdown_Viewer)
- Portable Offline Markdown Viewer
# [Markdown to SMF](https://huggingface.co/spaces/Naphula/model_tools/blob/main/md_to_smf.py)
- Converts a Markdown string to an SMF-compatible BBCode string. Not perfect; it sometimes misses double bold tags.
# [Quant Clone](https://github.com/electroglyph/quant_clone)
- A tool which allows you to recreate UD quants such as Q8_K_XL. Examples: [Mistral 24B](https://huggingface.co/spaces/Naphula/model_tools/raw/main/Mistral-Small-3.2-24B-Instruct-2506-UD-Q8_K_XL_UD.txt), [Mistral 7B](https://huggingface.co/spaces/Naphula/model_tools/raw/main/Warlock-7B-v2-Q8_K_XL.txt)
# [Text Analysis Suite v1.5](https://huggingface.co/spaces/Naphula/TAS_1.5)
- Analyze text files with advanced metrics
---
# Not Functional
# [Failed Experiment gguf_to_safetensors_v2.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/gguf_to_safetensors_v2.py)
- Unsuccessful attempt by Gemini to patch the gguf_to_safetensors script. Missing json files are hard to reconstruct. Also see [safetensors_meta_ripper_v1.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/safetensors_meta_ripper_v1.py) and [tokenizer_ripper_v1.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/tokenizer_ripper_v1.py)
# [IQ5_NL.md](https://huggingface.co/spaces/Naphula/model_tools/blob/main/IQ5_NL.md)
- Note: Not functional yet. Includes the code needed to quantize IQ5_NL GGUFs using block size 32.