Naphula committed (verified) · Commit aaca069 · 1 parent: 5f463e1

Update README.md

Files changed (1): README.md (+14 −1)
README.md CHANGED
@@ -32,8 +32,21 @@ Tools to enhance LLM quantizations and merging
  # [metadata_audit.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/metadata_audit.py)
  - Checks multiple models within subdirectories for vocab or rope mismatches (useful for large merges). Calibrated for Mistral Nemo 12B by default.
 
+ # llama moe
+ - Adds support for Llama Mixture of Experts. To merge a custom Llama MoE, add these scripts to your mergekit environment:
+ - [mergekit-main\mergekit\architecture\moe_defs.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/moe_defs.py)
+ - [mergekit-main\mergekit\__init__.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/__init__.py)
+ - [mergekit-main\mergekit\moe\llama.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/llama.py)
+ - Then set `num_experts_per_tok` in `config.json` (or the merge `config.yaml`).
+
+ # [tokensurgeon.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/tokensurgeon.py)
+ - Uses the adaptive-VRAM logic from Grim Jim's `measure.py` (as in `graph_v18`) to prevent OOM. Use the recommended [batch file](https://huggingface.co/spaces/Naphula/model_tools/blob/main/fix_tokenizers.bat), or adapt it into a shell script. This is intended to avoid 'cardboard town' fake patches.
+
+ # [tokeninspector.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/tokeninspector.py)
+ - Audits your tokensurgeon results.
+
  # [eos_scanner.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner.py)
- - This tool scans the tokenizer jsons to detect any mismatches with EOS tokens, which cause early termination bugs. You can then use the [gen_id_patcher.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/gen_id_patcher.py) to patch missing `generation_config.json` files for EOS token. See [this post](https://huggingface.co/Naphula/Q0_Bench/discussions/1?not-for-all-audiences=true#6987717c762f0a45f672e250) as well as the [EOS Scanner ReadMe](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner_readme.md) for more info.
+ - Updated! This tool scans the tokenizer JSONs to detect mismatched EOS tokens, which cause early-termination bugs. You can then use [gen_id_patcher.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/gen_id_patcher.py) to patch missing `generation_config.json` files with the correct EOS token. See [this post](https://huggingface.co/Naphula/Q0_Bench/discussions/1?not-for-all-audiences=true#6987717c762f0a45f672e250) as well as the [EOS Scanner ReadMe](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner_readme.md) for more info.
 
  # [weight_counter.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/weight_counter.py)
  - This counts the number of models in a YAML and adds up the total weight values. Useful for large della/ties merges.
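The subdirectory audit that `metadata_audit.py` performs can be sketched roughly as below: walk one level of model folders, read each `config.json`, and flag fields (here `vocab_size` and `rope_theta`, the usual vocab/rope culprits) whose values disagree across models. The field names follow the standard Hugging Face config schema; the actual script's logic and defaults may differ.

```python
import json
from pathlib import Path

def audit_metadata(root: str, keys=("vocab_size", "rope_theta")):
    """Collect the given config.json fields from each model subdirectory
    and return only the fields whose values differ across models.
    Illustrative sketch, not the actual metadata_audit.py."""
    seen = {k: {} for k in keys}  # field -> {model_name: value}
    for cfg_path in Path(root).glob("*/config.json"):
        cfg = json.loads(cfg_path.read_text(encoding="utf-8"))
        for k in keys:
            if k in cfg:
                seen[k][cfg_path.parent.name] = cfg[k]
    # A field is a mismatch when more than one distinct value was observed.
    return {k: v for k, v in seen.items() if len(set(map(str, v.values()))) > 1}
```

An empty result means every model agrees on every audited field; a non-empty result maps each conflicting field to the per-model values so you can see which checkpoint is the outlier.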
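The final llama-moe step above, assigning `num_experts_per_tok`, amounts to a one-field edit of the merged model's `config.json`. A minimal sketch, assuming the standard Mixtral-style field name (the helper name and the value `2` are illustrative, not part of the tools above):

```python
import json
from pathlib import Path

def set_experts_per_tok(config_path: str, k: int) -> None:
    """Set num_experts_per_tok in a model's config.json, in place."""
    p = Path(config_path)
    cfg = json.loads(p.read_text(encoding="utf-8"))
    cfg["num_experts_per_tok"] = k  # how many experts are routed per token
    p.write_text(json.dumps(cfg, indent=2), encoding="utf-8")

# e.g. set_experts_per_tok("merged-model/config.json", 2)
```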
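The EOS check that `eos_scanner.py` and `gen_id_patcher.py` address can be sketched as below: compare the `eos_token_id` declared in `config.json` against `generation_config.json`, and flag a missing or mismatched value. File and field names follow standard Hugging Face conventions; the real scripts may check more files and more cases.

```python
import json
from pathlib import Path

def check_eos(model_dir: str):
    """Return a list of human-readable EOS issues for one model directory;
    an empty list means no mismatch found. Illustrative sketch only."""
    root = Path(model_dir)

    def load(name):
        p = root / name
        return json.loads(p.read_text(encoding="utf-8")) if p.exists() else None

    config = load("config.json")
    gen = load("generation_config.json")

    issues = []
    cfg_eos = config.get("eos_token_id") if config else None
    if gen is None:
        issues.append("generation_config.json is missing (candidate for gen_id_patcher)")
    elif cfg_eos is not None and gen.get("eos_token_id") != cfg_eos:
        issues.append(
            f"eos_token_id mismatch: config.json={cfg_eos} "
            f"generation_config.json={gen.get('eos_token_id')}"
        )
    return issues
```

A mismatch here is exactly the early-termination failure mode described above: generation stops as soon as the sampler emits the token the runtime wrongly believes is EOS.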
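The `weight_counter.py` bookkeeping described above, counting the models in a merge config and summing their weights, can be sketched as below, assuming the YAML has already been loaded into a dict (e.g. with PyYAML) and follows the common mergekit shape of a `models:` list with `parameters.weight` entries; the real script may handle more schema variants.

```python
def summarize_merge(config: dict):
    """Return (model_count, total_weight) for a mergekit-style config dict.
    Models without an explicit weight contribute 0.0. Illustrative sketch."""
    models = config.get("models", [])
    total = sum(
        float(m.get("parameters", {}).get("weight", 0.0)) for m in models
    )
    return len(models), total
```

For della/ties merges this is a quick sanity check that the weights add up to what you intended (often 1.0) before launching a long merge.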