Original model: https://huggingface.co/Deci/DeciLM-7B-Instruct
```
[Modified llama.cpp](https://github.com/ymcki/llama.cpp-b4139) to support DeciLMForCausalLM's variable Grouped Query Attention. Please download it and compile it to run the GGUFs in this repository.
Please note that the HF model of DeciLM-7B-Instruct uses dynamic NTK-aware RoPE scaling. However, llama.cpp doesn't support it yet, so my modification simply ignores the dynamic NTK-aware RoPE scaling setting in config.json. Since the GGUFs seem to work for the time being, please use them as is until I figure out how to implement dynamic NTK-aware RoPE scaling.
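For background, dynamic NTK-aware scaling grows the RoPE frequency base as the sequence length exceeds the trained context, rather than keeping it fixed. A minimal Python sketch of that formula (mirroring the Hugging Face transformers implementation; the numeric values below are illustrative assumptions, not necessarily DeciLM-7B-Instruct's actual config):

```python
def dynamic_ntk_base(base: float, seq_len: int, max_pos: int,
                     head_dim: int, scaling_factor: float = 1.0) -> float:
    """Rescale the RoPE base when the context exceeds the trained length,
    following the dynamic NTK-aware formula used by HF transformers."""
    if seq_len <= max_pos:
        return base  # within the trained context: no rescaling
    factor = (scaling_factor * seq_len / max_pos) - (scaling_factor - 1)
    return base * factor ** (head_dim / (head_dim - 2))

def inv_freq(base: float, head_dim: int) -> list[float]:
    # Per-pair inverse frequencies that feed the rotary embedding angles.
    return [1.0 / base ** (i / head_dim) for i in range(0, head_dim, 2)]

# Illustrative numbers only: at twice the trained context,
# the effective base roughly doubles.
print(dynamic_ntk_base(10000.0, 16384, 8192, head_dim=128))
```

Ignoring this setting, as the modified build currently does, means long contexts beyond the trained length simply use the unscaled base.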