Update README.md
README.md
CHANGED
@@ -12,6 +12,8 @@ license_name: tencent-hunyuan-a13b
 
 Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b5845">b5845</a> for quantization.
 
+Using additionally this fork/PR for extra MoE performance: https://github.com/ggml-org/llama.cpp/pull/12727
+
 Original model: https://huggingface.co/tencent/Hunyuan-A13B-Instruct
 
 All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)