Original model: https://huggingface.co/Deci/DeciLM-7B-Instruct
```
I [modified llama.cpp](https://github.com/ymcki/llama.cpp-b4139) to support DeciLMCausalModel's variable Grouped Query Attention. Please download and compile it to run the GGUFs in this repository.
Please note that the HF model of DeciLM-7B-Instruct uses dynamic NTK-aware RoPE scaling. However, llama.cpp doesn't support it yet, so my modification simply ignores the dynamic NTK-aware RoPE scaling setting in config.json. Since the GGUFs seem to work, please use them as is for the time being, until I figure out how to implement dynamic NTK-aware RoPE scaling.
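For context on what is being ignored: dynamic NTK-aware scaling leaves RoPE unchanged within the trained context length and rescales the base frequency once the sequence grows past it. The sketch below follows the rescaling formula used by HF transformers' dynamic NTK rotary embedding; the function name and example numbers are mine, not from this repository:

```python
import math


def dynamic_ntk_base(base: float, dim: int, seq_len: int,
                     max_position_embeddings: int, factor: float) -> float:
    """Recompute the RoPE base frequency under dynamic NTK-aware scaling.

    Within the trained context length the base is untouched; beyond it,
    the base grows so that high-frequency rotary dimensions are preserved
    while low-frequency ones are interpolated.
    """
    if seq_len <= max_position_embeddings:
        return base  # plain RoPE inside the trained context
    scale = (factor * seq_len / max_position_embeddings) - (factor - 1)
    return base * scale ** (dim / (dim - 2))


# Example: a head dim of 128, trained context 4096, scaling factor 4.0.
# At 2048 tokens nothing changes; at 8192 tokens the base is inflated.
print(dynamic_ntk_base(10000.0, 128, 2048, 4096, 4.0))
print(dynamic_ntk_base(10000.0, 128, 8192, 4096, 4.0))
```

Because the modified llama.cpp skips this adjustment, the GGUFs behave like plain RoPE models; that is harmless at or below the trained context length, which is why they "seem to work" as is.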
## Download a file (not the whole branch) from below: