# [metadata_audit.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/metadata_audit.py)

- Checks multiple models within subdirectories for vocab or RoPE mismatches (useful for large merges). Calibrated for Mistral Nemo 12B by default.

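The kind of check described above can be sketched as follows. This is not the script's actual implementation: the reference values and the assumption that each model lives in its own subdirectory with a `config.json` are illustrative.

```python
# Minimal sketch of a vocab/RoPE metadata audit across model subdirectories.
# REFERENCE values are assumed placeholders, not metadata_audit.py's actual defaults.
import json
from pathlib import Path

REFERENCE = {"vocab_size": 131072, "rope_theta": 1000000.0}

def audit_models(root: str) -> list[str]:
    """Return a mismatch report for every */config.json under root."""
    problems = []
    for cfg_path in Path(root).glob("*/config.json"):
        cfg = json.loads(cfg_path.read_text())
        for key, expected in REFERENCE.items():
            actual = cfg.get(key)
            if actual != expected:
                problems.append(f"{cfg_path.parent.name}: {key}={actual} (expected {expected})")
    return problems
```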
# llama moe

- Adds support for Llama Mixture of Experts. If you want to merge a custom Llama MoE, add these scripts to your mergekit environment:
- [mergekit-main\mergekit\architecture\moe_defs.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/moe_defs.py)
- [mergekit-main\mergekit\__init__.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/__init__.py)
- [mergekit-main\mergekit\moe\llama.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/llama.py)
- Then assign `num_experts_per_tok` in `config.json` (or the config.yaml).

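The final step above can be scripted. A minimal sketch, assuming the merged model's `config.json` is in place; the value `2` and the `num_local_experts` key in the test are illustrative only.

```python
# Sketch: write num_experts_per_tok into a merged model's config.json.
# The chosen value is an example, not a recommendation.
import json
from pathlib import Path

def set_experts_per_tok(model_dir: str, n: int) -> dict:
    """Set num_experts_per_tok in config.json and return the updated config."""
    cfg_path = Path(model_dir) / "config.json"
    cfg = json.loads(cfg_path.read_text())
    cfg["num_experts_per_tok"] = n
    cfg_path.write_text(json.dumps(cfg, indent=2))
    return cfg
```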
# [tokensurgeon.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/tokensurgeon.py)

- Uses the adaptive VRAM logic from Grim Jim's `measure.py` (as in `graph_v18`) to prevent OOM errors. Use the recommended [batch file](https://huggingface.co/spaces/Naphula/model_tools/blob/main/fix_tokenizers.bat), or adapt it into a shell script. This supposedly avoids 'cardboard town' fake patches.

# [tokeninspector.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/tokensurgeon.py)

- Audits your tokensurgeon results.

# [eos_scanner.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner.py)

- Updated! This tool scans the tokenizer JSONs to detect mismatched EOS tokens, which cause early-termination bugs. You can then use [gen_id_patcher.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/gen_id_patcher.py) to patch a missing `generation_config.json` with the correct EOS token. See [this post](https://huggingface.co/Naphula/Q0_Bench/discussions/1?not-for-all-audiences=true#6987717c762f0a45f672e250) as well as the [EOS Scanner ReadMe](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner_readme.md) for more info.

# [weight_counter.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/weight_counter.py)

- Counts the number of models in a YAML and adds up the total weight values. Useful for large della/ties merges.
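A rough sketch of that computation, using a regex rather than a full YAML parser, so it only handles the flat `weight: 0.3` style seen in typical della/ties configs (not the actual implementation of weight_counter.py).

```python
# Sketch: count "- model:" entries in a mergekit YAML and sum "weight:" values.
import re

def count_and_sum(yaml_text: str) -> tuple[int, float]:
    """Return (number of models, total of all weight values) for a merge config."""
    models = len(re.findall(r"^\s*-\s*model:", yaml_text, flags=re.MULTILINE))
    total = sum(float(w) for w in
                re.findall(r"^\s*weight:\s*([0-9.]+)", yaml_text, flags=re.MULTILINE))
    return models, total
```

A quick sanity check that the summed weights land near 1.0 is the usual reason to run this on a large merge.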